We illustrate in terms familiar to modern-day science students that: (i) an uncertainty slope 
mechanism underlies the usefulness of temperature via its reciprocal, which is incidentally around 
42 [nats/eV] at the freezing point of water; (ii) energy over kT and differential heat capacity are 
"multiplicity exponents", i.e. the bits of state information lost to the environment outside a system 
per 2-fold increase in energy and temperature respectively; (iii) even awaiting description of "the 
dice", gambling theory gives form to the laws of thermodynamics, availability minimization, and 
net surprisals for measuring finite distances from equilibrium, information content differences, and 
complexity; (iv) heat and information engine properties underlie the biological distinction between 
autotrophs and heterotrophs, and life's ongoing symbioses between steady-state excitations and 
replicable codes; and (v) mutual information resources (i.e. correlations between structures, e.g. a 
phenomenon and its explanation, or an organism and its niche) within and across six boundary 
types (ranging from the edges of molecules to the gap between cultures) are delocalized physical 
structures whose development is a big part of the natural history of invention. These tools might 
offer a physical framework to students of the code-based sciences when considering such disparate 
(and sometimes competing) issues as conservation of available work and the nurturing of genetic or 
memetic diversity. 
I. INTRODUCTION 

The following are part of an evolving collection of notes 
drawn in part from lecture notes taken as a student, and in 
part based on experiences teaching statistical, modern, and 
introductory physics. The idea underlying the collection is 
that information theory since the days of Shannon sees entropy 
and other thermodynamic concepts as nothing more than tools for 
applying gambling theory (i.e. statistical inference) to 
physical systems with large numbers of similar and/or identical 
constituents. This paradigm shift [4] has already worked its way 
into many advanced and senior undergraduate textbooks on 
statistical physics. The deeper understanding and wider 
application, as well as the simplifications, that it affords to 
the introductory physics student are, with few exceptions [16], 
not yet available in texts. The objective here is simply to 
collect some of the snapshots offered by an information theory 
view, along with the calculation details and references that 
underlie them, for the benefit of teachers (as well as for 
authors, as markets develop for texts which put these insights 
to use). 



II. HOW HOT WORKS 

When you first heard it applied in the context of painful 
experience as a child, you likely gained appreciation for the 
meaning of "hot" without understanding the mechanisms behind its 
reputation. Our job here is to show you that hot, as bizarre as 
this may sound, means "low uncertainty slope for energy 
exchange". This is an assertion that draws from the wide 
applicability of such slopes in gambling theory, which predicts 
for example that conserved quantities [verify that entropy's 
concavity is automatic] will likely flow from low to high slope 
systems when given the chance. When the opportunities for random 
energy exchange are numerous, such predictions are highly 
accurate, and may be realized on so rapid a time scale that 
significant energy transfer occurs before your body has time to 
react and avoid damage. When energy is the conserved quantity, 
the uncertainty slope is called reciprocal temperature or 
coldness $1/kT$. It approaches infinity at absolute zero, and 
goes from 39 to 42 [nats/eV] as temperature decreases from room 
temperature to the freezing point of water. Here nats is a unit 
of information-uncertainty defined by #choices = $e^{\#nats}$, 
just as bits are defined by #choices = $2^{\#bits}$. LASERs 
operate by taking advantage of inverted population (i.e. 
negative uncertainty slope) states to deliver energy most 
anywhere. 
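
The two coldness values quoted above can be checked with a few lines of Python. This is only a sketch: the function name and the choice of 295 K for "room temperature" are assumptions made here for illustration.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann's constant in eV per Kelvin


def coldness_nats_per_ev(temp_kelvin):
    """Uncertainty slope 1/kT, in nats of uncertainty per eV of energy."""
    return 1.0 / (K_B_EV * temp_kelvin)


room = coldness_nats_per_ev(295.0)       # assumed room temperature
freezing = coldness_nats_per_ev(273.15)  # freezing point of water

print(f"room temperature: {room:.1f} nats/eV")
print(f"freezing point:   {freezing:.1f} nats/eV")
# One nat equals 1/ln(2) bits, so the same slopes in bits/eV are:
print(f"freezing point:   {freezing / math.log(2):.1f} bits/eV")
```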



A. Familiar relationships 

The systems of thermal physics traditionally involve molecules. 
Hence we first recall how to convert between molecules N and 
moles n using the gas constant R = 8.31 [J/(mole K)]. Since R is 
a product of Avogadro's number $N_A$ (essentially the number of 
atomic mass units in a gram) and Boltzmann's constant 
$k_B = 1.38 \times 10^{-23}$ [J/K], one can write... 

$$N k_B = (N/N_A)(N_A k_B) = nR \qquad (1)$$

In what follows, we will be sticking mainly with the left hand 
side of this equation (i.e. the molecular rather than the 
macroscopic point of view), and be using a quantity k to 
determine the units used for temperature T. In the particular 
case when k is chosen to be $k_B$, the temperature will be in 
historical units (e.g. [Kelvins]). When k = 1, or when we 
equivalently consider the quantity kT rather than T as the 
temperature, then we will say that temperature is in "natural 
units". Below, we show that in natural units temperature may be 
expressed in Joules (or electron volts) per nat of mutual 
information about an object's state lost to the world around. 

Before examining this in more detail, let's consider 
a couple of useful elementary thermodynamic relation- 
ships: "equipartition" and "the ideal gas equation of 
state" . 

Equipartition: $U = \frac{\nu N}{2} kT = \frac{\nu n}{2} RT$ (2) 

Here $\nu$ is often called the number of degrees of freedom, or 
modes of thermal energy storage, per molecule. The equation 
relates extensive quantity U, the amount of randomly-distributed 
mechanical (kinetic and potential) energy in a gas or solid, to 
an intensive quantity: its absolute temperature T. We show later 
how this relation arises from the equation for a quadratic 
system's number of accessible states. Likewise, the equation of 
state for an ideal gas below follows from the assumption that 
each molecule in an ideal gas has a volume of V in which to "get 
lost". 

Ideal Gas: PV = NkT = nRT (3) 

The above equation thus relates extensive quantity vol- 
ume V to intensive quantities: absolute temperature T 
and pressure P. 

Using these two equations, show that energy and temperature are 
quite different by proving to yourself that when you build a 
fire in an igloo, the total thermal kinetic energy of the air 
inside is unchanged! (Hint: This is true even though the 
temperature of the air goes up.) 
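
A numerical version of the igloo exercise is sketched below in Python. It assumes that the igloo leaks, so that the air inside stays at ambient pressure while its volume is fixed, and that air has about 5 degrees of freedom per molecule; the function name and the sample numbers are illustrative only.

```python
K_B = 1.380649e-23  # Boltzmann's constant in J/K


def air_thermal_energy(pressure_pa, volume_m3, temp_k, dof=5):
    """Thermal energy U of the air in a fixed volume at fixed pressure."""
    # Ideal gas law (3): number of molecules that fit at this temperature
    n_molecules = pressure_pa * volume_m3 / (K_B * temp_k)
    # Equipartition (2): U = (nu N / 2) kT
    return 0.5 * dof * n_molecules * K_B * temp_k


cold_igloo = air_thermal_energy(101325.0, 10.0, 263.0)
warm_igloo = air_thermal_energy(101325.0, 10.0, 293.0)

print(f"U at 263 K: {cold_igloo:.6e} J")
print(f"U at 293 K: {warm_igloo:.6e} J")
```

Combining the two relations gives $U = \frac{\nu}{2} PV$, so temperature drops out: warmer air simply means fewer molecules remain inside.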

B. Law zero with teeth 

To examine the way that thermal physics can give birth to such 
relations, a useful concept is the multiplicity or "number of 
accessible states" $\Omega$. Since for macroscopic systems this 
is often an unimaginably huge number (on the order of 
$e^{N_A}$), one commonly deals with its logarithm, the 
uncertainty or "entropy" $S = k \ln \Omega$. (Look for more on 
the connection between uncertainties which depend on one's frame 
of reference, and physical entropy, later.) S is measured in 
information units [nats, bits, or J/K] depending on whether k is 
chosen to be {1, $1/\ln 2$, or $k_B$} respectively. Knowing the 
dependence of multiplicity and hence S on any conserved quantity 
X (like energy, volume, or number of particles), shared randomly 
between two systems, allows one to "guess" how X is likely to 
end up distributed between the two systems. One simply chooses 
that sharing of X which can happen in the largest number of 
ways, a mathematical exercise (try doing it yourself!) which for 
reasonable functions predicts that systems will most likely 
adopt subsystem X-values for which subsystem uncertainty slopes 
$\partial S / \partial X$ are equal, i.e. 

$$X \text{ equilibrated} \Rightarrow S_{tot} \text{ maximized} \Rightarrow \text{all } \frac{\partial S_i}{\partial X_i} \text{ equal.} \qquad (4)$$
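
The "try doing it yourself" exercise can also be done by brute force. The Python sketch below assumes two subsystems whose multiplicities are simple power laws in the shared quantity X (the exponents 300 and 700 and the function names are made up for illustration), searches for the split that can happen in the most ways, and then compares the two uncertainty slopes.

```python
import math


def log_multiplicity(x, exponent):
    """S/k = ln(Omega) for an assumed power law Omega proportional to x**exponent."""
    return exponent * math.log(x)


def most_likely_split(x_total, exp_a, exp_b, steps=10000):
    """Grid search for the share of X in system A that maximizes total S/k."""
    best_x, best_s = None, -math.inf
    for j in range(1, steps):
        x_a = x_total * j / steps
        s_tot = (log_multiplicity(x_a, exp_a)
                 + log_multiplicity(x_total - x_a, exp_b))
        if s_tot > best_s:
            best_x, best_s = x_a, s_tot
    return best_x


x_total, exp_a, exp_b = 10.0, 300.0, 700.0
x_a = most_likely_split(x_total, exp_a, exp_b)

# Uncertainty slopes d(S/k)/dX for each subsystem at the most likely split
slope_a = exp_a / x_a
slope_b = exp_b / (x_total - x_a)
print(f"X in A: {x_a:.3f}, slope A: {slope_a:.2f}, slope B: {slope_b:.2f}")
```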



1. Energy & equipartition 

This simple assertion yields some powerful results. Consider 
first the large class of macroscopic systems which can be 
classified as "approximately quadratic in thermal energy". For 
these we can write $\Omega \propto U^{\nu N/2}$, where as above 
N is the number of molecules and $\nu$ is the number of degrees 
of freedom per molecule. Such systems include low density gases, 
metals near room temperature, and many other macroscopic systems 
at least in some part of their temperature range. Using c to 
denote a constant not dependent on energy U, one can then 
calculate uncertainty S and its first and second derivatives: 

$$\Omega \propto U^{\nu N/2} \Rightarrow \frac{S}{k} = \frac{\nu N}{2} \ln U + c. \qquad (5)$$

The first derivative says that the energy uncertainty slope of 
such systems, a quantity predicted to "become the same for all 
subsystems allowed to equilibrate in thermal contact", is 
$\frac{1}{k}\frac{\partial S}{\partial U} = \frac{\nu N}{2U}$. 
This quantity is in historical parlance known as reciprocal 
temperature, i.e. as $\frac{1}{kT}$. One can thus solve this 
equation for energy to get the equipartition relation above: 
$U = \frac{\nu N}{2} kT$. 

The second energy derivative of uncertainty is the negative 
quantity $\frac{1}{k}\frac{\partial^2 S}{\partial U^2} = 
-\frac{\nu N}{2U^2}$. Hence systems with greater energy have 
lower uncertainty slope. As a result, energy flow during thermal 
equilibration goes from systems of lower to higher uncertainty 
slope, and equivalently from higher to lower temperature. This 
rate of uncertainty increase per unit energy gain (also called 
"coldness") thus behaves like a kind of hunger for thermal 
energy, just as gas pressure (below) can be seen as an appetite 
to acquire volume. By comparison, hot objects are like 
reservoirs of excess thermal energy which has limited room to 
play. Hence the energy uncertainty slope (about 39 nats/eV at 
room temperature, running to infinity as one approaches absolute 
zero) effectively drives the random flow of heat. The second 
derivative calculation above (by taking a square root) also 
allows one to estimate the size of observed temperature (or 
energy) fluctuations, as will be shown more quantitatively 
later. 
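
As a rough companion to the square-root remark, the Python sketch below assumes a Gaussian approximation in which the energy spread of a quadratic system (exchanging energy with a much larger reservoir) is $[-\partial^2 (S/k)/\partial U^2]^{-1/2} = U\sqrt{2/(\nu N)}$. That assumption and the function name are introduced here for illustration; this is not the quantitative treatment promised for later.

```python
import math

N_AVOGADRO = 6.02214076e23


def relative_energy_fluctuation(dof_per_molecule, n_molecules):
    """Estimated sigma_U / U for a quadratic system, from the 2nd derivative.

    Uses -(1/k) d2S/dU2 = nu N / (2 U^2), so sigma_U ~ U * sqrt(2 / (nu N)).
    """
    return math.sqrt(2.0 / (dof_per_molecule * n_molecules))


tiny_cluster = relative_energy_fluctuation(3, 100)
one_mole = relative_energy_fluctuation(3, N_AVOGADRO)

print(f"100 atoms: sigma_U/U ~ {tiny_cluster:.3f}")
print(f"one mole:  sigma_U/U ~ {one_mole:.2e}")
```

The estimate suggests why such fluctuations go unnoticed in everyday objects yet matter for systems of only a few hundred atoms.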



2. Volume & ideal gas laws 

A system that has a simple volume dependence for the number of 
accessible states is the ideal gas. If the gas has sufficiently 
low density that gas molecules seldom encounter one another, 
then the number of places any particular gas molecule may occupy 
is likely proportional to the volume V to which the gas is 
confined. Moreover, the independence of molecules in this low 
density (ideal gas) case means that the number of accessible 
states for the whole is simply proportional to the product of 
the number of states for each molecule separately, so that 
$\Omega \propto V^N$. As above, we can then calculate 
uncertainty and its first and second derivatives: 

$$\Omega \propto V^N \Rightarrow \frac{S}{k} = N \ln V + c. \qquad (6)$$

The 1st derivative is $\frac{1}{k}\frac{\partial S}{\partial V} 
= \frac{N}{V}$ and the 2nd derivative is 
$\frac{1}{k}\frac{\partial^2 S}{\partial V^2} = 
-\frac{N}{V^2}$. The negative value of the latter suggests again 
that volume is likely to spontaneously flow (when being randomly 
shared) from systems of lower uncertainty slope to systems of 
higher slope. 

But what is the physical meaning of free expansion coefficient 
$\gamma = \frac{1}{k}\frac{\partial S}{\partial V}$? A clue 
might come from thinking of it as a product of 
$\frac{1}{k}\frac{\partial S}{\partial U}$ and 
$\frac{\partial U}{\partial V}$. As discussed above, the former 
is normally written as $\frac{1}{kT}$, while 
$\frac{\partial U}{\partial V}$ is nothing other than the change 
in energy per unit volume as in Work = PdV or in other words a 
pressure. Thus the calculation above tells us that 
$\frac{P}{kT}$ (the free expansion coefficient for an ideal gas) 
equals $\frac{N}{V}$. Hence the ideal gas law! 

This volume uncertainty slope, in natural units at standard 
temperature and pressure, is about $2.5 \times 10^{19}$ 
[nats/cc]: much less than the atomic density of around 
$10^{23}$ atoms per cc for solids. The negative 2nd derivative 
predicts that for systems at the same temperature*, volume will 
"spontaneously flow" from systems of lower to higher pressure. 
Put another way, high pressure systems will expand at the 
expense of their low pressure neighbors, something that is quite 
consistent with observation. 

* A thermally-insulating barrier between two systems which 
allows "totally random sharing of volume" is difficult to 
imagine. Easy to imagine is a rigid but mobile partition, 
dividing a closed cylinder into two gas-tight halves. In this 
case, gases on opposite sides will adjust P to a common value on 
both sides of the barrier, thus establishing mechanical 
(momentum transfer) equilibrium with unequal densities and 
temperatures. The higher temperature (lower density) side will 
then experience fewer, albeit higher energy, collisions. These 
will eventually result in thermal equilibration by differential 
energy transfer, even if we have to think of the wall as a 
single giant molecule with one degree of freedom, whose own 
average kinetic energy will "communicate" uncertainty slope 
differences between sides. 



3. Particles & mass action 

The random sharing of particles (for example in a reaction) also 
gains its sense of direction from the 0th Law of Thermodynamics 
described here. First, determine how the number of accessible 
states depends on the number of particles. Taking derivatives of 
uncertainty with respect to particle number for an ideal gas, 
one finds that $\frac{1}{k}\frac{\partial S}{\partial N}$ (also 
known as chemical affinity $\alpha = -\frac{\mu}{kT}$) 
approaches $\ln \frac{Q}{N/V}$, where "quantum concentration" Q 
is the number of particles per unit volume allowed by thermal 
limits on particle movement. Here $\frac{Q}{N/V}$ is effectively 
the number of available non-interacting quantum states per 
particle. As particle number density n = N/V increases toward Q, 
this number of states per particle decreases toward 1, affinity 
$\alpha$ (near 16 [nats/particle] for Argon gas at standard 
temperature and pressure) falls toward zero, and ideality is 
lost. Ratios between $Q_i$ values in gas reactions, for the 
various components i of a reaction, yield an equilibrium 
constant that allows one to predict ratios between resulting 
concentrations $n_i$. 

For example, if we consider the reaction 
$A_2 + 2B \leftrightarrow 2AB$, we expect equilibrium when the 
affinities of reactants on both sides are equal, i.e. when 

$$\alpha_{A_2} + 2\alpha_B = 2\alpha_{AB}. \qquad (7)$$

Hence the equilibrium constant is 

$$K \equiv \frac{n_{AB}^2}{n_{A_2} n_B^2} = \frac{Q_{AB}^2}{Q_{A_2} Q_B^2} \qquad (8)$$

where the middle term depends only on reactant concentrations, 
while the last term is a function of experimental conditions, 
e.g. for a monatomic ideal gas $Q = (2\pi m kT/h^2)^{3/2}$ where 
m is each atom's mass and h is Planck's constant. Thus the 
behavior of the concentration balance as a function of 
temperature may be predicted. 
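
The quoted affinity for Argon can be reproduced from the monatomic ideal gas expression for Q. The Python sketch below takes "standard temperature and pressure" to mean 273.15 K and 101325 Pa, which is an assumption made here, as are the function names.

```python
import math

K_B = 1.380649e-23     # Boltzmann's constant, J/K
H = 6.62607015e-34     # Planck's constant, J s
AMU = 1.66053907e-27   # atomic mass unit, kg


def quantum_concentration(mass_kg, temp_k):
    """Q = (2 pi m k T / h^2)^(3/2), in particles per cubic meter."""
    return (2.0 * math.pi * mass_kg * K_B * temp_k / H**2) ** 1.5


def ideal_gas_affinity(mass_kg, temp_k, pressure_pa):
    """Chemical affinity ln(Q / n) in nats per particle, with n = P / kT."""
    number_density = pressure_pa / (K_B * temp_k)
    return math.log(quantum_concentration(mass_kg, temp_k) / number_density)


argon_mass = 39.948 * AMU
alpha = ideal_gas_affinity(argon_mass, 273.15, 101325.0)
print(f"Argon affinity at STP: {alpha:.1f} nats/particle")
```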



III. THE FIRST AND SECOND LAWS 

In addition to the 0th Law and some state equations, one can get 
the 1st and 2nd Laws of Thermodynamics by combining gambling 
theory with conservation of energy and other shared variables. 
We first illustrate with some heuristic arguments, and then 
present some more general results with help from the maximum 
entropy formalism. 

For an isolated system (one cut off from ourselves and the rest 
of the world), the first and second laws are intuitive. 
Conservation of energy U requires that 

$$dU_{tot}/dt = 0, \qquad (9)$$

and intuition suggests that our uncertainty $S = k \ln[\Omega]$ 
about the microscopic state of a system (while we're cut off 
from it) is unlikely to decrease with time. Although this may 
sound like it makes our presence as observers crucial to the 
time evolution of a system from which we are isolated, it does 
not. As we show below, it is instead equivalent to saying that 
the mutual information between isolated systems is not likely to 
increase with time. Thus it is a prediction about the behavior 
of the larger system of which these subsystems are a part. It 
does have a direct impact on how our assertions about that 
system's state as a function of time will correlate with what we 
find, should we decide to terminate the isolation at some point 
and look inside. 

The classical example of such irreversible change is the free 
expansion of a gas confined to one half of an evacuated volume, 
upon failure of a partition dividing that volume in half. If the 
gas is ideal, in fact, the number of accessible states per 
particle doubles so that the entropy increase is one bit per 
particle. Such isolated system entropy increases are called 
irreversible, and hence we can write: 

$$\frac{dS}{dt} = \frac{dS_{irr}}{dt} \geq 0. \qquad (10)$$

Equation (9) is rigorous within limits of the energy-time 
uncertainty principle, while equation (10) is only a 
probabilistic assertion, albeit one often backed up by excellent 
statistics! 

For a system allowed to share thermal energy U, volume V, and 
particles N with its environment, the change of entropy can be 
written: 

$$dS = \left(\frac{\partial S}{\partial U}\right)_{V,N} dU + \left(\frac{\partial S}{\partial V}\right)_{U,N} dV + \left(\frac{\partial S}{\partial N}\right)_{U,V} dN + \delta S_{irr}. \qquad (11)$$

The first term in parentheses is 1/T, the second P/T, and the 
third $-\mu/T$ from our statistical definitions of the intensive 
variables. If we solve this for dU, while defining as flows of 
heat $\delta Q$ those energy changes NOT associated with changes 
in a specific extensive variable (like V or N), we get the most 
common "open system" version of the First Law, 

$$dU = \delta Q_{in} - \delta W_{out}. \qquad (12)$$

Here $\delta W = P dV - \mu dN$ denotes work done by the system 
on its external environment as it gains volume or loses 
particles. The resulting equality of $\delta Q$ with 
$T(dS - \delta S_{irr})$, rearranged, yields the open system 
form for the Second Law: 

$$dS = \frac{\delta Q_{in}}{T} + \delta S_{irr}, \text{ where } \frac{dS_{irr}}{dt} \geq 0. \qquad (13)$$

Here of course $W_{out}$, $Q_{in}$ and $S_{irr}$ are defined 
only in the context of their respective pathways for energy or 
entropy change, and are not themselves functions of the state of 
the system at all. The use here of $\delta$, instead of d, to 
represent their differentials is thus because those 
differentials are mathematically inexact. 

Note that in the process we have also shown, for reversible 
changes, i.e. when $\delta S_{irr} = 0$, that 
$P = -(\partial U/\partial V)_{S,N}$ is a measure of force per 
unit area, and $\mu = (\partial U/\partial N)_{S,V}$ is a 
measure of energy per particle. Of course, with added terms of 
the same form equation (13) can accommodate many simultaneous 
kinds of work and particle exchange. 

Given the 1st and 2nd Laws, along with $\Omega(U, V)$ for an 
ideal gas and its consequences, a large number of simple but 
interesting problems can be considered by introductory students 
in detail. These include a host of gas expansion problems (e.g. 
isothermal, isobaric, isochoric, adiabatic, and free), a set of 
two-system problems which include information loss during 
irreversible cooling (e.g. a cup of coffee whose initial net 
surprisal from subsubsection IV B 4 is approximately 
$C_v(\frac{T}{T_a} - 1 - \ln\frac{T}{T_a})$), attempts by 
Maxwell's Demon at reversing the heat flow process, and the 
symmetric vacuum-pump memory (or isothermal compressor) 
discussed in subsection IV B. Second Law limits on converting 
high temperature heat to room temperature in the presence of an 
external low temperature reservoir also yield surprising 
results. If the dependence of $\Omega$ on N can be introduced, 
as discussed above, entropies of mixing and chemical reaction 
rates may be considered as well. 
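
Two of the problems just listed are easy to try numerically. The Python sketch below computes the entropy of free expansion in bits per particle, and the net surprisal of a cooling cup of coffee, which it checks against the total entropy gained by coffee plus room. The heat capacity (0.25 kg of water), the temperatures, and all function names are assumptions chosen for illustration.

```python
import math


def theta(x):
    """Theta[x] = x - 1 - ln(x), which is zero at x = 1 and positive elsewhere."""
    return x - 1.0 - math.log(x)


def free_expansion_bits_per_particle(volume_ratio):
    """Ideal gas entropy gain, in bits per particle, when V grows by this ratio."""
    return math.log2(volume_ratio)


def cooling_net_surprisal(heat_capacity, temp, temp_ambient):
    """Net surprisal C * Theta[T / T_a], in the units of heat_capacity."""
    return heat_capacity * theta(temp / temp_ambient)


bits = free_expansion_bits_per_particle(2.0)

c_coffee = 0.25 * 4186.0           # J/K, assumed cup of water
t_coffee, t_room = 353.0, 293.0    # Kelvins, assumed
surprisal = cooling_net_surprisal(c_coffee, t_coffee, t_room)

# Direct bookkeeping: entropy lost by the coffee plus entropy gained by the room
direct = (c_coffee * math.log(t_room / t_coffee)
          + c_coffee * (t_coffee - t_room) / t_room)

print(f"free expansion to twice the volume: {bits} bit per particle")
print(f"coffee net surprisal: {surprisal:.2f} J/K (direct sum: {direct:.2f} J/K)")
```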
IV. THE MAXENT "BEST-GUESS" MACHINE 

If one has information over and above the state inventory needed 
to determine $\Omega$, it can be used to modify the assignment 
of equal a priori probabilities, e.g. used above when we 
maximized uncertainty for the ("microcanonical") case in which 
extensive variables (like energy, volume and number of 
particles) are all considered independent variables or work 
parameters. To do this, one first writes entropy in terms of 
probabilities by defining for each probability $p_i$ a 
"surprisal" $s_i = k \ln[1/p_i]$, in units determined by the 
value of k. The average value of this surprisal reduces to 
$S = k \ln \Omega$ when the $p_i$ are all equal, yielding a 
generalized multiplicity $\Omega = e^{S/k}$. Note that the 
relationships described here will likely translate seamlessly 
into quantum mechanical applications [3]. 



A. The problem 

Our (at first glance benign) task is to maximize average 
surprisal... 

$$\frac{S}{k} = \sum_{i=1}^{\Omega} p_i \ln\frac{1}{p_i} = -\sum_{i=1}^{\Omega} p_i \ln p_i \qquad (14)$$

...subject to the normalization requirement that the 
probabilities add to 1, i.e. that... 

$$\sum_{i=1}^{\Omega} p_i = 1, \qquad (15)$$

...along with the "expected average" of R constraints which take 
the form... 

$$E_r = \langle e_r \rangle = \sum_{i=1}^{\Omega} p_i e_{ri}, \ \forall r \in \{1, R\}. \qquad (16)$$

A simple example (useful for free electron and neutron gas 
models) can be thought of as that of a weighted coin which lands 
"heads up" (to be specific) six tenths of the time. In that 
case, there would be $\Omega = 2$ states, and we would have 
R = 1 e.g. with $e_{11} = 0$, $e_{12} = 1$, and $E_1 = 0.6$. A 
more general example, that of the maxent calculation underlying 
the Bell curve, is presented in Appendix E. 



B. The solution 

The Lagrange method of undetermined multipliers tells us that 
the solution for the ith of $\Omega$ probabilities is simply... 

$$p_i = \frac{1}{Z} e^{-\sum_{r=1}^{R} \lambda_r e_{ri}}, \ \forall i \in \{1, \Omega\}, \qquad (17)$$

where partition function Z is defined to normalize probabilities 
as... 

$$Z = \sum_{i=1}^{\Omega} e^{-\sum_{r=1}^{R} \lambda_r e_{ri}}. \qquad (18)$$

Here $\lambda_r$ is the Lagrange (or "heat") multiplier for the 
rth constraint, and $e_{ri}$ is the value of the rth parameter 
when the system is in the ith accessible state. For example, 
when $E_r$ is the energy U, $\lambda_r$ is often written as 
$\frac{1}{kT}$. Values for these multipliers can be calculated 
by substituting the two equations above back into the constraint 
equations, or from the differential relations derived below. 

For example, equation (18) gives $Z = 1 + e^{-\lambda_1}$ for 
our coin problem, so that $p_1 = \frac{1}{Z}$, 
$p_2 = \frac{e^{-\lambda_1}}{Z}$ and 
$E_1 = \frac{1}{1 + e^{\lambda_1}}$. Solving in terms of $E_1$ 
we get $\lambda_1 = \ln(\frac{1}{E_1} - 1) = -.405$, Z = 2.5, 
$p_1 = 0.4$, $p_2 = 0.6$, and $\lambda_1 E_1$ (a quantity which 
will prove useful later) is -.243. 

The resulting entropy, maximized under specified constraints, 
is... 

$$\frac{S}{k} = \ln[Z] + \sum_{r=1}^{R} \lambda_r E_r. \qquad (19)$$

A useful quantity which has been minimized in constraint-free 
fashion, by this calculation, is the "availability in 
information units" 

$$A = -k \ln Z = k \sum_{r=1}^{R} \lambda_r E_r - S. \qquad (20)$$

For example, in the coin problem the maximized entropy is 
$S/k = \ln[Z] + \lambda_1 E_1 = 0.673$ nats, and the minimized 
availability $A/k = \lambda_1 E_1 - S/k = -.916$ nats. Since 
constrained entropy maximization minimizes availability without 
constraints, assuming of course that the $e_{ri}$ coefficients 
for all values of r and i are held constant themselves (cf. page 
46 of Betts and Turner), gambling theory most simply states that 
the best guess (in the absence of other information) is the 
state with minimum availability. 
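
The coin numbers can be reproduced in a few lines. The Python sketch below uses the closed-form multiplier for a two-state problem with state values 0 and 1; the function name and the layout of the returned tuple are illustrative choices, not part of the formalism.

```python
import math


def solve_coin(expected_heads):
    """Maxent solution for two states with values 0 and 1 and a fixed average."""
    lam = math.log(1.0 / expected_heads - 1.0)   # multiplier lambda_1
    z = 1.0 + math.exp(-lam)                     # partition function (18)
    probs = (1.0 / z, math.exp(-lam) / z)        # probabilities (17)
    entropy = math.log(z) + lam * expected_heads  # S/k from (19)
    availability = lam * expected_heads - entropy  # A/k from (20)
    return lam, z, probs, entropy, availability


lam, z, probs, entropy, availability = solve_coin(0.6)

# Cross-check S/k against the average surprisal of equation (14)
direct_entropy = -sum(p * math.log(p) for p in probs)

print(f"lambda_1 = {lam:.3f}, Z = {z:.3f}")
print(f"p = ({probs[0]:.3f}, {probs[1]:.3f})")
print(f"S/k = {entropy:.3f} nats, A/k = {availability:.3f} nats")
```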

Generalized availability in turn can be seen as the common 
numerator behind a range of dimensioned availabilities, with the 
properties of thermodynamic free energy (one for each variable 
of type r). These are defined as... 

$$A_r \equiv -\frac{\ln Z}{\lambda_r} = \sum_{u=1}^{R} \frac{\lambda_u}{\lambda_r} E_u - \frac{S}{k \lambda_r}, \ \forall r \in \{1, R\}. \qquad (21)$$

Standard thermodynamic applications include microcanonical 
ensemble calculations like those with which we began this note 
(R = 0 so that A = -S), the canonical ensemble for systems in 
contact with a heat bath at fixed temperature (there R = 1 and 
$E_1$ is energy U so that $\lambda_1 = \frac{1}{kT}$, and 
$A_1 = U - TS$ is the Helmholtz free energy), the pressure 
ensemble (R = 2 with $E_1$ energy U so that 
$\lambda_1 = \frac{1}{kT}$, $E_2$ volume V so that 
$\lambda_2 = \frac{P}{kT}$, and $A_1 = U + PV - TS$ is the Gibbs 
free energy), and the grand canonical ensemble for "open 
systems" (same as pressure except that $E_2 = N$ so that 
$\lambda_2 = -\frac{\mu}{kT}$ and $A_1 = U - \mu N - TS$ is the 
grand potential). 

Note that this calculation requires an assignment of the 
$e_{ri}$ for all possible states, but it involves no other 
physical assumptions (like equilibrium or energy conservation). 
In the sense given here, systems for which the recommended guess 
is not appropriate are systems about which more is known (e.g. 
about constraints or state structure) than is taken into 
account. Types of constraints other than the averages used in 
equation (16), e.g. correlation information constraints like 
those discussed later, are likely worth learning to put to use, 
but that is not done here. 



1. Physics-free laws 1 & 2 



We now begin to look at the effect of small changes. For 
example, the $e_{ri}$ are often not themselves constant, but 
depend on the value of a set of "work parameters" $X_m$, where 
here m is an integer between 1 and M. For example in the Gibbs 
canonical ensemble calculations for an ideal gas, the energies 
of the various allowed states may depend on volume V or particle 
number N. Following Jaynes we can define work-types for each 
constraint r in terms of the rate at which $E_r$ changes with 
$X_m$: 



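$$\delta W_r \equiv -\sum_{m=1}^{M} \frac{\partial E_r}{\partial X_m} \delta X_m = -\sum_{m=1}^{M} \delta X_m \sum_{i=1}^{\Omega} \left( e_{ri} \frac{\partial p_i}{\partial X_m} + p_i \frac{\partial e_{ri}}{\partial X_m} \right), \ \forall r \in \{1, R\}. \qquad (22)$$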



Note that the various "work increments" $\delta W_r$ have the 
same units as the corresponding constraint parameters $E_r$, and 
unless otherwise noted that partials are taken under "ensemble 
conditions" (i.e. holding constant all unused "control" 
parameters $X_m$ and $\lambda_r$). 

It's also useful when discussing work to define the generalized 
enthalpies 

$$H_r \equiv E_r - \sum_{m=1}^{M} \frac{\partial E_r}{\partial X_m} X_m, \ \forall r \in \{1, R\}. \qquad (23)$$

For example, in the canonical ensemble case mentioned above, 
with volume V the only allowed work parameter, equation (22) 
becomes simply $\delta W_1 = P \delta V$, and equation (23) 
becomes $H_1 = U + PV$. 

In equation (22) we have left open the possibility that changes 
in $X_m$ may alter probabilities directly, e.g. by making new 
volume available for free expansion, rather than simply via 
their effect on the state parameters $e_{ri}$. This allows us to 
mathematically incorporate "irreversible" changes in entropy by 
averaging this term over all work parameters and all 
constraints... 

$$\frac{\delta S_{irr}}{k} \equiv \sum_{r=1}^{R} \lambda_r \sum_{m=1}^{M} \delta X_m \sum_{i=1}^{\Omega} e_{ri} \left( \frac{\partial p_i}{\partial X_m} \right). \qquad (24)$$

If we further define "heat increments" $\delta Q_r$ of the rth 
type as... 

$$\delta Q_r \equiv \sum_{i=1}^{\Omega} e_{ri} \sum_{u=1}^{R} \delta\lambda_u \left( \frac{\partial p_i}{\partial \lambda_u} \right), \ \forall r \in \{1, R\}, \qquad (25)$$

a couple of familiar differential relationships follow... 

$$\delta Q_r - \delta W_r = \sum_{i=1}^{\Omega} (e_{ri} \delta p_i + p_i \delta e_{ri}) = \delta E_r, \ \forall r \in \{1, R\}, \qquad (26)$$

and 

$$\sum_{r=1}^{R} \lambda_r \delta Q_r + \frac{\delta S_{irr}}{k} = \sum_{r=1}^{R} \sum_{i=1}^{\Omega} \lambda_r e_{ri} \delta p_i = \frac{\delta S}{k}. \qquad (27)$$

These are more general forms of the open system 1st and 2nd Laws 
(equations (12) and (13)), based purely on statistical inference 
from a description of allowed states. The familiar physics only 
arrives, e.g. for the canonical ensemble case when R = 1, if we 
further postulate that $E_1$ represents a conserved quantity in 
transfers between systems, and that $\delta S_{irr} \geq 0$. 
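
The statistical form of the 2nd Law can be spot-checked numerically in the simplest setting: one constraint, fixed state values, and no change in work parameters, so that the irreversible term vanishes and $\delta S/k$ should equal $\lambda_1 \delta Q_1$. The three state values and the multiplier in this Python sketch are arbitrary assumptions, as are the function names.

```python
import math

LEVELS = (0.0, 1.0, 2.5)  # assumed state values e_1i, held fixed


def probabilities(lam):
    """Maxent probabilities for a single constraint with multiplier lam."""
    weights = [math.exp(-lam * e) for e in LEVELS]
    z = sum(weights)
    return [w / z for w in weights]


def entropy(probs):
    """S/k as the average surprisal."""
    return -sum(p * math.log(p) for p in probs)


lam, d_lam = 1.2, 1e-6
p_before = probabilities(lam)
p_after = probabilities(lam + d_lam)

# Heat increment: state values weighted by the change in each probability
d_heat = sum(e * (b - a) for e, a, b in zip(LEVELS, p_before, p_after))
d_entropy = entropy(p_after) - entropy(p_before)

print(f"dS/k        = {d_entropy:.6e}")
print(f"lambda * dQ = {lam * d_heat:.6e}")
```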



2. Symmetry between ensembles 

Different thermodynamic "ensembles" often relegate a work 
parameter $X_m$ to the status of a constraint $E_r$ by expansion 
of the state sum to include all possible values for the work 
parameter. The classic example is the pressure ensemble 
mentioned above, in which the traditional work parameter volume 
V is introduced as a constraint enabling, for example, a study 
of volume fluctuations. The symmetry of the equations with 
respect to these quantities might be better seen if we define M 
"work multipliers" $J_m$, analogous to the R "heat multipliers" 
$\lambda_r$, as averages over all constraints $E_r$ of the rate 
at which the $e_{ri}$ depend on the work parameters to which 
they correspond, i.e. 

$$J_m \equiv -\sum_{r=1}^{R} \lambda_r \sum_{i=1}^{\Omega} p_i \frac{\partial e_{ri}}{\partial X_m}, \ \forall m \in \{1, M\}. \qquad (28)$$

Then we can also write... 

$$\sum_{m=1}^{M} J_m \delta X_m = \frac{\delta S}{k} - \sum_{r=1}^{R} \lambda_r \delta E_r = -\frac{\delta A}{k} + \sum_{r=1}^{R} E_r \delta\lambda_r, \qquad (29)$$

...yielding a harvest of partial derivative relationships for 
entropy and availability. 

These include the connection between multipliers (like 
reciprocal temperature) and entropy derivatives: 

$$\lambda_r = \left(\frac{\partial S/k}{\partial E_r}\right)_{E_{s \neq r}, X_m}; \quad J_m = \left(\frac{\partial S/k}{\partial X_m}\right)_{X_{n \neq m}, E_r}, \qquad (30)$$

which for example allow one to show that integral heat 
capacities like $\frac{U}{kT}$ are logarithmic entropy 
derivatives (or "multiplicity exponents") of the form 
$U \frac{\partial (S/k)}{\partial U}$ taken under no-work 
conditions. Equation (27) allows one to show more generally that 
integral heat capacities are also multiplicity exponents of the 
$E_r$, and that differential heat capacities taken with 
$\delta S_{irr} = 0$, e.g. of the form 
$\frac{1}{k}\frac{\delta Q}{\delta T}$, are in a complementary 
way multiplicity exponents of the $\lambda_r$, e.g. of the form 
$T \frac{\partial (S/k)}{\partial T}$. 

The "availability slope" partials relate fluctuating parameters 
$J_m$ and $E_r$ to control parameter partials taken under 
ensemble conditions, i.e. 

$$J_m = -\left(\frac{\partial A/k}{\partial X_m}\right)_{X_{n \neq m}, \lambda_r}; \quad E_r = \left(\frac{\partial A/k}{\partial \lambda_r}\right)_{\lambda_{s \neq r}, X_m}. \qquad (31)$$

These are the starting point for our assertion, in the abstract, 
about the general usefulness of uncertainty slopes in problems 
of statistical inference. The next section takes the assertion a 
step further, by providing insight into fluctuations and 
correlations. 

3. Fluctuations and reciprocity 

Again following Jaynes and taking partials under ensemble 
constraints, given 

$$\frac{\partial p_i}{\partial \lambda_r} = p_i (E_r - e_{ri}), \ \forall i \in \{1, \Omega\} \ \& \ \forall r \in \{1, R\}, \qquad (32)$$

and 

$$\delta p_i = \sum_{r=1}^{R} \frac{\partial p_i}{\partial \lambda_r} \delta\lambda_r + \sum_{m=1}^{M} \frac{\partial p_i}{\partial X_m} \delta X_m, \qquad (33)$$

one can show quite generally that the covariance between 
parameters $E_r$ and $E_s$ (namely $\langle e_r e_s \rangle - 
\langle e_r \rangle \langle e_s \rangle$) is minus the partial 
of $E_r$ with respect to $\lambda_s$, and hence that 

$$\sigma^2_{E_r E_s} = -\frac{\partial E_r}{\partial \lambda_s} = -\frac{\partial E_s}{\partial \lambda_r} = -\frac{\partial^2 A/k}{\partial \lambda_r \partial \lambda_s}, \ \forall r, s \in \{1, R\}. \qquad (34)$$

The latter equality gives rise to the Onsager reciprocity 
relations of non-equilibrium thermodynamics. 

For the special case when r = s, the above expression also sets 
the variance (standard deviation squared) of $E_r$ to 
$\sigma_r^2 = -\frac{\partial E_r}{\partial \lambda_r}$. Since 
the left hand side of this equation seems likely positive, the 
equation says that temperature, for example, is likely to 
increase with increasing energy. This turns out to be true even 
for systems like spin systems which exhibit negative absolute 
temperatures, provided we recognize that negative absolute 
temperatures are in fact higher than positive absolute 
temperatures, i.e. that the relative size of temperatures must 
be determined from their reciprocal (1/kT) ordering. Since this 
quantity is also proportional to heat capacity, equation (34) 
also says that when heat capacity is singular (e.g. during a 
first order phase change), the fluctuation spectrum will 
experience a spike as well. 
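
The r = s case is easy to confirm numerically for a small system. The Python sketch below compares the variance of a single constrained quantity with minus the numerical derivative of its average with respect to the multiplier; the four state values, the multiplier, and the function name are arbitrary assumptions.

```python
import math

STATE_VALUES = (0.0, 0.7, 1.9, 3.0)  # assumed values e_1i for a 4-state system


def mean_and_variance(lam):
    """Average and variance of the constrained quantity at multiplier lam."""
    weights = [math.exp(-lam * e) for e in STATE_VALUES]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean = sum(p * e for p, e in zip(probs, STATE_VALUES))
    mean_sq = sum(p * e * e for p, e in zip(probs, STATE_VALUES))
    return mean, mean_sq - mean * mean


lam, step = 0.8, 1e-5
_, variance = mean_and_variance(lam)

# Central-difference estimate of dE/d(lambda)
slope = (mean_and_variance(lam + step)[0]
         - mean_and_variance(lam - step)[0]) / (2.0 * step)

print(f"variance        = {variance:.8f}")
print(f"-dE/d(lambda)   = {-slope:.8f}")
```

Since the variance cannot be negative, the average must fall as the multiplier rises, which is the statement that energy increases with temperature.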

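Although we only conjecture based on symmetry here, similar 
relations may also obtain for the work multipliers, e.g. 

$$\sigma^2_{J_m J_n} = -\frac{\partial J_m}{\partial X_n} = -\frac{\partial J_n}{\partial X_m} = \frac{\partial^2 A/k}{\partial X_m \partial X_n}, \ \forall m, n \in \{1, M\}, \qquad (35)$$

as well as for hybrid multiplier covariances. 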



4. Net surprisal & availability 

Changes in availability under ensemble constraints, as in the 
derivatives above, can also be seen as whole system changes in 
uncertainty relative to a reference state [20,21], i.e. as 
changes in net surprisal. Here we define net surprisal as 

$$I_{net} \equiv k \sum_{i=1}^{\Omega} p_i \ln\left(\frac{p_i}{p_{oi}}\right) \geq 0, \qquad (36)$$

where the $p_{oi}$ are state probabilities based only on ambient 
state information, while the $p_i$ take into account all that is 
known. The inequality follows simply since each set of 
probabilities contains only positive values that add to one. It 
then follows from equations (17), (18) and (19) that 

$$\frac{I_{net}}{k} = -\left(\frac{S}{k} - \frac{S_o}{k}\right) + \sum_{r=1}^{R} (E_r - E_{ro}) \lambda_{ro}, \qquad (37)$$

provided our system's deviation from the reference state (here 
no longer infinitesimal but finite) does not involve changes in 
the work parameters $X_m$, since this would constitute a change 
in the problem (e.g. the energy level structure) being 
considered. If the $E_r$ are conserved in transfer between 
systems, net surprisal is simply the entropy increase of system 
plus environment on equilibration to ambient, with the surprisal 
value of excess $E_r$ simply the ambient uncertainty slope 
$\lambda_{ro}$ (e.g. 1/T for energy U). Using equation (20) we 
then get 

$$\frac{I_{net}}{k} = \frac{A}{k} - \frac{A_o}{k} - \sum_{r=1}^{R} (\lambda_r - \lambda_{ro}) E_r. \qquad (38)$$

Thus near ambient conditions, derivatives of availability under 
ensemble conditions are also derivatives of net surprisal. 
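
For a concrete check, the weighted coin from the maxent example can be compared with a fair-coin reference. In the Python sketch below the reference choice is an assumption made here: a fair coin corresponds to an ambient multiplier of zero, so the sum over constraints contributes nothing and net surprisal reduces to the entropy deficit. Function names are illustrative.

```python
import math


def net_surprisal(probs, ambient_probs):
    """I_net / k: average of ln(p_i / p_oi) over the known probabilities p_i."""
    return sum(p * math.log(p / q) for p, q in zip(probs, ambient_probs))


def entropy(probs):
    """S/k as the average surprisal."""
    return -sum(p * math.log(p) for p in probs)


weighted = (0.4, 0.6)   # maxent solution for the coin with average 0.6
fair = (0.5, 0.5)       # assumed ambient reference
lambda_ambient = 0.0    # multiplier that yields the fair coin

direct = net_surprisal(weighted, fair)
via_entropy = (-(entropy(weighted) - entropy(fair))
               + (0.6 - 0.5) * lambda_ambient)

print(f"direct sum:        {direct:.5f} nats")
print(f"entropy deficit:   {via_entropy:.5f} nats")
```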

For example, systems in thermal contact with an ambient
temperature bath may be treated as canonical ensemble
systems with constrained average energy. Thus a
temperature deviation from ambient $T_a$ for a monatomic
ideal gas gives for that system $I_{net}/k = \frac{3}{2} N \Theta[T/T_a]$, where
$\Theta[x] \equiv x - 1 - \ln x \ge 0$. If that system is also in contact
with an ambient pressure bath (i.e. able to randomly
share volume and energy), volume deviations add
$N \Theta[V/V_a]$ to the foregoing. For grand canonical systems
whose molecule types might change (e.g. via chemical
reaction), one instead adds $N_j \Theta[N_{ja}/N_j]$ for each molecule
type j whose concentration varies from ambient.
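As a rough numerical illustration of the temperature and volume terms just described, the Python sketch below evaluates the function x - 1 - ln x and the resulting net surprisal of a monatomic ideal gas, converted to bits. The function names, the one-mole sample, and the chosen temperatures and volume ratio are assumptions made for illustration only; molecule-number deviations are left out.

```python
import math

AVOGADRO = 6.02214076e23  # atoms per mole (exact SI value)

def theta(x):
    """Theta[x] = x - 1 - ln x: zero at x = 1 and positive for any other x > 0."""
    return x - 1.0 - math.log(x)

def ideal_gas_net_surprisal_bits(n_atoms, t, t_ambient, v=None, v_ambient=None):
    """Net surprisal (bits) of a monatomic ideal gas away from ambient.

    The temperature term is (3/2) N Theta[T/T_a]; if both volumes are given,
    the volume term N Theta[V/V_a] is added.
    """
    nats = 1.5 * n_atoms * theta(t / t_ambient)
    if v is not None and v_ambient is not None:
        nats += n_atoms * theta(v / v_ambient)
    return nats / math.log(2)

# Illustrative case: one mole of gas at 373.15 K in a 293.15 K room,
# first at ambient volume, then also compressed to half that volume.
hot_only = ideal_gas_net_surprisal_bits(AVOGADRO, 373.15, 293.15)
hot_and_compressed = ideal_gas_net_surprisal_bits(AVOGADRO, 373.15, 293.15, 0.5, 1.0)

print(f"temperature deviation only: {hot_only:.3e} bits")
print(f"plus volume deviation:      {hot_and_compressed:.3e} bits")
```

Because the function is zero only at x = 1, any deviation from ambient, hotter or colder, larger or smaller, adds a positive contribution.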

Not only is the concept of net surprisal simply represented
in context of the general maxent calculation, it
also offers simplifying insight into thermodynamic processes.
For example, an at first glance counter-intuitive
problem offered to intro physics students at the University
of Illinois asks how cold the room must be for an otherwise
unpowered device to take boiling water in at the
top, only to return it as ice water with a bit of ice therein
at the bottom. Since the 2nd law allows conversion of one
form of net surprisal reversibly into another (famously
without a clue how to do it in practice), one can use
the fact that the function $\Theta$ above works for "quadratic
systems" in general to set $C_V \Theta[T_{hot}/T_{room}] = C_V \Theta[T_{ice}/T_{room}]$ and
solve for $T_{room}$. Thus with net surprisal in hand the problem
becomes both conceptually, and analytically, simple.
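The room-temperature condition can be solved in a few lines. The Python sketch below is a minimal illustration, assuming the same constant heat capacity for the water going in and coming out (so the heat capacity cancels) and ignoring the latent heat of the "bit of ice"; the function names and the bisection tolerance are hypothetical choices. It finds the room temperature at which the net surprisal of the boiling water equals that of the ice water, and compares it with the closed form that the same balance gives, the logarithmic mean of the two temperatures.

```python
import math

def theta(x):
    """Theta[x] = x - 1 - ln x, the net surprisal per unit heat capacity."""
    return x - 1.0 - math.log(x)

def surprisal_gap(t_room, t_hot, t_cold):
    """Net surprisal of the hot input minus that of the cold output (per unit C)."""
    return theta(t_hot / t_room) - theta(t_cold / t_room)

def limiting_room_temperature(t_hot, t_cold, tol=1e-9):
    """Bisect for the room temperature where the gap vanishes.

    The gap is positive at t_room = t_cold and negative at t_room = t_hot,
    and it decreases steadily in between.
    """
    lo, hi = t_cold, t_hot
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if surprisal_gap(mid, t_hot, t_cold) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

T_BOIL, T_ICE = 373.15, 273.15  # kelvin

t_limit = limiting_room_temperature(T_BOIL, T_ICE)
t_closed_form = (T_BOIL - T_ICE) / math.log(T_BOIL / T_ICE)

print(f"bisection:   {t_limit:.3f} K")
print(f"closed form: {t_closed_form:.3f} K")
```

Under these assumptions the limit comes out near 320 K (roughly 47 degrees Celsius). Any colder room leaves the incoming boiling water with more net surprisal than the outgoing ice water needs, so the process is then allowed in principle.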

Of course, the net surprisal measure is not only relevant
to inference about systems for which physically
conserved energy is of interest (i.e. thermodynamic systems).
In fact, one might conjecture that it meets the
requirements for an information measure proposed by
Gell-Mann and Lloyd^{22}, and that it includes the Shannon
information measure discussed there as a special case. By
way of a specific application, net surprisal's usefulness for
quantifying the amount of information students bring to
an exam is illustrated in an appendix. Thus armed with
statistical inference tools that underpin traditional thermodynamic
applications, but which require no physical
assumptions a priori short of a state inventory, we now
take a look at some of the more complex system areas
where applications (already underway in many fields) will
likely continue to develop.



V. STEADY-STATE ENGINES 

Begin by considering within our larger isolated system
the possibility of "steady-state engines", i.e. devices
which operate in some fashion on their surroundings
while remaining (to first order) the same themselves before
and after. If we refer to $U_i$ and $S_i$ respectively as
the steady state energy and entropy of engine i, then the
change with time of these values will be (by definition)
negligible. Hence such steady state engines contribute
little or nothing to time variations in total system energy
and entropy, and the 1st and 2nd Laws applied
to engines plus environment mean that the same equations
apply to the energy U and entropy S external to
such steady-state systems alone.

Since energy and work can be exchanged in both ways 
between our engines and their environment, it is conve- 
nient to write: 

$$dU = (\delta Q_{out} + \delta W_{out}) - (\delta Q_{in} + \delta W_{in}) = 0, \text{ and} \tag{39}$$



$$dS = \frac{\delta Q_{out}}{T_{out}} - \frac{\delta Q_{in}}{T_{in}} = \delta S_{irr} \ge 0. \tag{40}$$



Here the terms with the subscript "in" represent flows 
of energy into our steady state engines, while the terms 
with the subscript "out" represent flows of energy out 
from those same engines. These equations open the door 
for students to a wide variety of "thermodynamic pos- 
sibility" calculations. Heat and information engines will 
be our focus here. 



A. Heat engines & biomass creation

Heat engines as in Fig. 1 are generally defined as
steady-state systems which take in heat from a high temperature
(e.g. a combustion) reservoir, and return that
energy as heat and "ordered energy" to a lower temperature
(e.g. ambient) reservoir. Car and steam engines fall
into this category, if we allow burning fuel to be considered
their source of high temperature heat.

The equations above also work with forms of plant life
which take in sunlight (high temperature heat) and store
chemical energy (i.e. work) in plant biomass (e.g. in
cellulose, carbohydrates, proteins, and fats). In this case,
PdV work may be ignored, and $\delta W_{out} - \delta W_{in}$ becomes
the change in chemical potential times the number of
molecules whose state is changed by solar irradiation.
Ecologists refer to organisms that do this as autotrophs
("self-nourishers") or primary producers.

The exhaust (i.e. low temperature) reservoir for most
heat engines is the ambient environment. Refrigerators
and electric heat pumps are by comparison heat engines
run in reverse, i.e. they take in work and heat from
a low temperature reservoir, exhausting it to a warmer
ambient. All of the exhausted heat is eventually radiated
at around 300K back into space by the earth, letting us
see the earth itself as a steady-state heat engine as well.

FIG. 1: Schematic for a heat engine which brings in heat energy
at high temperature and returns heat at low temperature
along with energy available for work. One example, applicable
to automobile engines, is the lifting of a weighted piston
with hot gas which cools on expansion. Another is the storage
of available work in plant biomass via photosynthesis of
radiation from the sun.

Solving equations 39 and 40, and assuming that $\delta Q_{in}$ is
positive, we get the familiar upper limit on energy available
for work that a Carnot engine (i.e. an engine whose
heat flows out of and into a pair of fixed-temperature
reservoirs) can produce:



$$\frac{\delta W_{out} - \delta W_{in}}{\delta Q_{in}} \le \frac{T_{in} - T_{out}}{T_{in}}. \tag{41}$$



Most real heat engines have efficiencies (conversion frac- 
tions) which are beneath this because of irreversibilities 
during operation. 
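To make the Carnot bound concrete, the Python sketch below computes the largest net work obtainable from a given heat input. The function name is hypothetical, and the reservoir temperatures in the usage lines (a 1500 K combustion source, and sunlight treated as roughly 5800 K heat, both exhausting at 300 K) are illustrative assumptions rather than figures from the text.

```python
def carnot_work_limit(q_in, t_in, t_out):
    """Upper bound on net work for heat q_in absorbed at t_in and exhausted at t_out.

    Temperatures are in kelvin; the bound is q_in * (1 - t_out / t_in).
    """
    if not (t_in > t_out > 0.0):
        raise ValueError("need t_in > t_out > 0 (kelvin)")
    return q_in * (1.0 - t_out / t_in)

# Fraction of each joule of heat that could at best become work.
combustion_fraction = carnot_work_limit(1.0, 1500.0, 300.0)  # assumed flame temperature
sunlight_fraction = carnot_work_limit(1.0, 5800.0, 300.0)    # assumed solar temperature

print(f"combustion engine bound: {combustion_fraction:.3f}")
print(f"sunlight-driven bound:   {sunlight_fraction:.3f}")
```

These are ceilings only; as noted above, irreversibilities keep real engines (and real photosynthesis) below them.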



B. Information engines & us 

The concepts of thermodynamics have been tradition- 
ally honed in systems near or approaching equilibrium, 
and the entropy of homogeneous systems at equilibrium 
is an extensive quantity like energy or volume or number 
of particles. However, the maximum entropy best guess 
machine is much less restrictive about the kinds of sys- 
tem to which it applies. In particular, uncertainty about 
the state of a system in general depends not only on what 
we know about each component of a system, but what 
we know about the relationship between components. 

For example, if 10 white and 10 black marbles are distributed
between two drawers A and B, then one has
$S = 20\,k \ln 2$, or 20 bits, of uncertainty about the drawer
assignment of these marbles (i.e. one true-false question's
worth, or bit, of uncertainty per marble). However, if one
is given as true the statement that "marbles in any given
drawer are all the same color", the uncertainty about the
drawer assignment of marbles is reduced to $k \ln 2$, or one
bit of uncertainty. Even though a bit (literally) of uncertainty
remains about the drawer assignment for each
individual marble, as before, the total uncertainty has
now been decreased by the mutual information in that
statement, or



$$M \equiv \left(\sum_{i=1}^{N_{ss}} S_i\right) - S_{total}. \tag{42}$$



In our example, there are $N_{ss} = 20$ subsystems, and
this equation shows that M = 20 - 1 = 19 bits of mutual
information are contained in the statement quoted above!
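The marble count can be verified by brute force. The Python sketch below is an illustration only: it assumes every drawer assignment consistent with the statement is equally likely, encodes each assignment as a bit mask (bit set means drawer B), and uses hypothetical function names. It enumerates all assignments, keeps those in which no drawer holds both colors, and subtracts the joint uncertainty from the sum of the single-marble uncertainties.

```python
import math

def binary_entropy_bits(p):
    """Uncertainty in bits of a two-outcome variable with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def marble_mutual_information_bits(n_white=10, n_black=10):
    """Mutual information in 'marbles in any given drawer are all the same color'."""
    n = n_white + n_black
    full_white = (1 << n_white) - 1
    full_black = (1 << n_black) - 1
    allowed = []
    for mask in range(1 << n):
        white = mask & full_white   # low bits: drawer of each white marble
        black = mask >> n_white     # high bits: drawer of each black marble
        white_in_b, white_in_a = white != 0, white != full_white
        black_in_b, black_in_a = black != 0, black != full_black
        # Reject assignments where some drawer holds both colors.
        if (white_in_a and black_in_a) or (white_in_b and black_in_b):
            continue
        allowed.append(mask)
    # Joint uncertainty: log2 of the number of equally likely allowed states.
    s_total = math.log2(len(allowed))
    # Sum of single-marble uncertainties from the marginal drawer probabilities.
    s_parts = 0.0
    for i in range(n):
        p_drawer_b = sum((m >> i) & 1 for m in allowed) / len(allowed)
        s_parts += binary_entropy_bits(p_drawer_b)
    return s_parts - s_total

m_bits = marble_mutual_information_bits(10, 10)
print(f"mutual information: {m_bits:.1f} bits")
```

Only two assignments survive the constraint (whites in A and blacks in B, or the reverse), so each marble alone still looks like a coin flip while the collection as a whole carries a single bit of uncertainty.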

Mutual information (e.g. that two spins are correlated,
or that two gases have not been well mixed) plays a well-known
role in physical systems as well, with recent
focus in particular on its impact in nucleic acid
replication and in quantum computing. For example,
Grosse et al. use intra-molecule mutual information
to distinguish coding and non-coding DNA, instead
of autocorrelation functions, because the former does not
require mapping symbols to numbers, and because it is
sensitive to non-linear as well as linear dependences. Although
constraints of this sort may be incorporated into the maxent
formalism (cf. Appendix B), we take the possibility of
such correlations into account here by simply modifying
equation 13 to read



$$\delta S = \frac{\delta Q}{T} - \delta M_{internal} + \delta S_{irr}. \tag{43}$$



This makes the 2nd Law relevant to engines whose primary
function concerns tasks not explicitly involving
changes in energy, such as the job of putting "the kids'
socks in one pile and the parents' socks in another", or
the challenge of reversible computing. When $\delta M_{internal}$
changes are important, however, note that entropy cannot
be considered an extensive quantity like U, V, and N,
since the total uncertainty S about the state of a system
may be less than the sum of the uncertainties about the
state of its constituent parts.

This strategy reflects recent thinking about the energy
cost of information in generalizing the Maxwell's
demon problem. Zurek among others suggests that
the only requisite cost of recording information about
other components in a system is the cost of preparing the
blank sheet (or resetting the measuring apparatus) prior
to recording with it. Moreover, the minimum thermodynamic
cost, in energy per unit of correlation information,
is simply the ambient temperature T.

A classic example of this is the isothermal compressor
for an N-atom gas (Fig. 2), taken for the case when N=1. The
system requires thermalization of no less than $kT \ln 2$ of
work, in return for a single bit of correlation information
concerning the location of the atom. That correlation
information in turn can be used subsequently to perform
with arbitrarily small energy cost the same task of locating
the atom on a desired side, by simply rotating the
cylinder by 180 degrees as needed!

FIG. 2: A symmetric bi-partitioned cell for the isothermal
compression of an N-atom ideal gas into either its left or
right half, perhaps first discussed by Szilard, which serves
as a physical system about whose state mutual information is
available for a well-defined price in free energy. If one is further
provided with some mechanism (e.g. spectroscopic) for
reading its state, it may also serve as a mechanically operated
single-bit memory. The "setting" process involves removing
the barrier between compartments, using the piston on one
side to relocate all atoms into the opposite half and then returning
the barrier before returning the piston to its original
position. The required work is $W_{in} = NkT \ln 2$. Its reset
status may be defined as true if we know that the atoms are
located in the right half of the container, and false if we don't
know this to be the case.

The $\Delta M$ term also lets us see the isolated system Second
Law (Eqn 10) in a new light. Begin with a system
A with $\Omega$ total accessible states, so that uncertainty
about A is at most $S[A] = k \ln \Omega$. Then consider an
observer B, with sufficient added information about A
to limit the number of accessible states to $\Gamma \le \Omega$. Observer
B therefore has conditional uncertainty about A
(see Appendix B) of $S_B[A] = k \ln \Gamma$. What we can learn
about A by knowing B is then the mutual information
$M[A,B] = S[A] + S[B] - S[AB] = S[A] - S_B[A] = k \ln(\Omega/\Gamma)$.
If the basic structure of system A from which $\Omega$ was calculated
remains intact, then the isolated system 2nd Law
assertion that observer uncertainty about isolated A can
only stay constant or increase (i.e. that $dS_B[A]/dt \ge 0$)
implies also that the mutual information between two
isolated systems (here M[A,B]) can only stay constant or decrease.

FIG. 3: Schematic for an information engine which in the process
of thermalizing energy available for work creates mutual
information (also called correlation information, information
in structure, or negentropy). One example is the isothermal
compression of an N-atom gas into the left half of a compartmented
container with a vacuum pump, while thermalizing
at least $NkT \ln 2$ of work energy and creating some external
record of the occurrence. Another example might be the
resetting (erasure) of a used sheet of paper or a molecular
template, as a first step in the encoding or replication of a
message.

Now we consider steady state engines whose function
is to produce mutual information or correlations between
two systems, as in Fig. 3. These correlations might, for
example, be marble collections sorted by color, a faithful
copy of a strand of DNA, or dots on a sky map corresponding
to the position of stars in the night sky. Our
first and second law engine equations (from 39 and 40
with mutual information) become

$$dU = \delta Q_{out} - \delta W_{in} = 0, \text{ and} \tag{44}$$

$$\delta S = \frac{\delta Q_{out}}{T_{out}} - \delta M_{external} = \delta S_{irr} \ge 0. \tag{45}$$

Eliminating $\delta Q_{out}$ from these two equations yields

$$\delta M_{external} \le \frac{\delta W_{in}}{T_{out}}. \tag{46}$$

This means that information engines can produce no 
more mutual information than their energy consumption, 
divided by their ambient operating temperature. In bi- 
nary information units, this amounts to producing about 
55 bits of information per eV of thermalized work at room 
temperature, and around 60 bits per eV of energy if op- 
erating near the freezing point of water. 
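The bits-per-eV figures follow from dividing one unit of thermalized work by kT ln 2. The Python sketch below is a small illustration; the function names are hypothetical, and "room temperature" is taken to be 300 K, which is an assumption since the text does not give a value.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K (CODATA 2018)

def nats_per_ev(temperature_k):
    """Reciprocal temperature 1/kT, in nats of uncertainty per eV."""
    return 1.0 / (K_B_EV * temperature_k)

def bits_per_ev(temperature_k):
    """Most mutual information (bits) obtainable per eV thermalized at this temperature."""
    return nats_per_ev(temperature_k) / math.log(2)

for label, t in (("room temperature (assumed 300 K)", 300.0),
                 ("freezing point of water", 273.15)):
    print(f"{label}: {bits_per_ev(t):.1f} bits/eV, {nats_per_ev(t):.1f} nats/eV")
```

With these constants the bound comes out a little under 56 bits per eV at 300 K and a little over 61 bits per eV at 273.15 K, consistent with the rounded values quoted above.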

Cameras, tape recorders, and copying machines may
be considered such information engines, as are forms of
life which take in chemical energy available for work from
plant biomass, and thermalize that energy at ambient
temperature while creating correlations between objects
in their environment and their survival needs (e.g. in the
form of persistent DNA sequences, behavior redirections,
songs, rituals, books, and sets of ideas). Living organisms
which do not qualify as heat engines, but which fit this
description, are known by ecologists as heterotrophs, a
category which includes most non-photosynthetic organisms
(including humans).

FIG. 4: Life's Energy Flow: The top half represents some
of the primary processes involved with energy flow, while
the bottom half illustrates significant physical repositories for
life's energies, and paths for conversion of energy from one
form to another.

For a human being consuming $1.3 \times 10^7$ joules (around
3000 kcal or 7 twinkies) per day, and viewed as an information
engine, this implies an upper limit on production
of $4 \times 10^{27}$ bits of mutual information in our environment
per day. (Note: This includes non-coded correlations,
like laundry which has been sorted into piles, as well as
coded correlations such as a map of the night sky as it
looked at 11pm from your backyard.) Much like the limits that
conservation of mechanical energy places on speed in roller
coaster rides, and that Gauss's law places on net charge within
a volume based on field measurements at the boundaries,
this assertion may well stand quite independent of the
detailed biochemistry going on inside. Alas some of us,
in practice, have trouble putting a one page report of
new observations in a file per week! Although individual
metazoans in fact bolster the correlation information in
their environment on many levels (see the next section),
even when unassisted by other sources of energy available
for work, it is likely that the above inequality is not
a major bottleneck for most.
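The daily human-scale bound above is a one-line calculation. The Python sketch below reproduces it under the assumption of a 300 K ambient operating temperature, which the text does not specify; the function name is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def max_mutual_information_bits(energy_joules, temperature_k):
    """Upper limit on bits of mutual information from thermalizing the given work."""
    return energy_joules / (K_B * temperature_k * math.log(2))

DAILY_INTAKE_J = 1.3e7   # about 3000 kcal per day, as quoted in the text
AMBIENT_K = 300.0        # assumed ambient operating temperature

daily_limit = max_mutual_information_bits(DAILY_INTAKE_J, AMBIENT_K)
print(f"upper limit: {daily_limit:.2e} bits per day")
```

The result is a few times 10^27 bits per day, the same order as the figure quoted in the text; using body temperature instead of 300 K lowers it by only a few percent.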



VI. EXCITATIONS AND CODES 

Figure 4 illustrates the flow of energy available to life
on earth, much of which began as high temperature radiative
heat given off by the sun, subsequently converted
to chemical potential energy (heat engine style) in the
form of plant biomass. Much work might still be done
to quantify these flows, since the flow rate through
biomass is seldom even considered outside of classes in
ecology. For example, many introductory physics texts,
and even the world almanac, ignore its size entirely.
Hence student projects on the size of these flows, at various
times and places, might be interesting and enjoyable.
Likewise for projects which examine the involvement
of various consumables and activities in the depicted
streams, e.g. the availability cost of a hot dog, or
an aluminum can.

FIG. 5: Life's Stores of Availability: Horizontal bars represent
inward-looking (intra or "yin") correlations, while vertical
bars represent outward-looking (cross-boundary or "yang")
correlations. This breakdown seems to work reasonably well
to categorize by domain both the types of correlations that
exist, and the kinds of ideas (i.e. memetic replicators) used
to help maintain them.

Some of the "ordered energy" outputs from the heat
engines described in Figure 4 are eventually thermalized
(e.g. in forest fires or the burning of fossil fuels), but not
all of it is irreversibly thermalized. In other words, some
of the free energy made available by plants is converted
to non-energy related correlations between organism and
environment, and some is converted to internal correlations
within living things.

Organism/environment correlations include, for exam- 
ple, cell membranes that separate the contents of one- 
celled organisms from the fluids surrounding. They differ, 
depending on the nature of the ambient to which a given 
organism is adapted. Similarly, the woody trunks of trees 
don't merely store chemical energy for later combustion, 
but instead point in a direction which allows subsequent 
leaf growth to have better access to the light of the sun. 

Correlations internal to organisms include catalysts
(often amino acid enzymes) which guide the spending of
the cell's energy coin (adenosine triphosphate molecules)
not only toward nourishment and other external goals,
but toward its use in the process of cell replication. The
enzymes themselves are typically constructed from amino
acid sequences which fold in solution into secondary and
delicate tertiary structures which are crucial to catalyst
structure and function.



In fact, information on these correlations within cat- 
alyst molecules, resulting in part also in correlations 
between organisms and their environment, apparently 
proved so important that a digital means (nucleic acid 
codes) to store mutual information on these correlations, 
still in widespread use today, was developed several billion
years ago. Note that the word digital here refers
to ways to store mutual information in which bit-wise 
fidelity of the replication process can be checked after 
the fact. This is distinct from analog forms of recording, 
like the storage of images on film, where accuracy on the 
microscopic scale is lost statistically in the grain struc- 
ture of the film, as one moves to increasingly smaller size 
scales. 

This ancient invention of digital recording more or
less formalized a now long-standing symbiosis between
steady-state excitations (in particular organisms which
operate in part by reversibly thermalizing an inward-flowing
stream of energy in the form of available work)
and replicable codes. This excitation-code symbiosis, of
course, involves mutual information managed (stored,
replicated, and applied to enzyme manufacture) by biological
cells. Now memetic replicators, i.e. ideas
which began as sharable patterns stored in the neural
nets of animals^{41}, are in the process of going
digital^{43-44}, thus adding a second level to life's symbiosis
with replicators. The unconscious struggle for hegemony
over organisms, between these two replicator families,
might in a way be seen as a battle between "sword"
and "pen" in which (strangely enough) organisms are the
spoils of war. At the very least, it seems likely that under
some conditions the interests of organisms, and the
interests of codes, don't commute.

Naturally our "organism-centric" vantage point
prompts us to miss, at first glance, the way that organisms
serve codes in a given process. We might sometimes
even miss the distinction between "our ideas about
the world" (those replicable codes) and "the world itself"
(a complex excitation with deep internal correlations)^{45},
as though we are in danger of "knowing everything" with
a completed map of the universe in our minds. A
closer look at nature, however, reveals that true cloning
of internally-correlated excitations (e.g. like qubits)
may be impossible in principle as well as in practice.

A natural way to "illustrate and inventory" the standing
crop of correlations associated with life, while recognizing
boundaries between replicator-pools as well as
simpler physical boundaries (like cell walls and individual
spaces), is illustrated in Figure 5. Again, students
might find it enjoyable to think about ways to inventory
this standing crop of correlations, at various places and
times. Although in principle each bar in the figure could
be quantified in "bits of standing availability", neither
the means nor the motivation for doing this objectively
are clear at this point. However, just picturing qualitatively
the state of these correlations and the boundaries
with which they associate, as physical elements in the
world around, might be worthwhile (cf. Fig. 6).

FIG. 6: An expanded schematic of some boundaries alone,
drawn from Fig. 5. This suggests for example that the idea
sets for interaction between cultures (i.e. when one has to take
into account more than one book in the figure, or meme-pool)
may be quite different than those for interactions between
hierarchies in a given culture. Guidelines for professionalism
in the workplace, and political correctness, as irrelevant as
they may seem in the context of one culture, might be evidence
of behavior correlations developing along those lines.



VII. THE NATURAL HISTORY OF INVENTION 

Histories of emergent phenomena, like Marshall
McLuhan's "Gutenberg Galaxy", Konrad Lorenz' "On
Aggression", Margulis & Sagan's "Microcosmos",
Jared Diamond's "Guns, Germs, and Steel", David Attenborough's
Special on Birds, and Ward & Brownlee's
"Rare Earth" (in broad strokes at least) simplify when
outlined in terms of the two manifestations of generalized
availability depicted in Figures 4 and 5: (i) "ordered"
or free energy, and (ii) mutual information, respectively.
These two themes repeatedly intertwine in a
non-repeating drama that involves partnership between
replicable codes (which incidentally include the above
two concepts), and what physicists might call steady-state
excitations busily converting energy and information
from one form to another. This history shows potential
for providing a neutral perspective, grounded in
established physical and logical principles, on numerous
important and sometimes contentious issues. By providing
context both for such issues and our reactions to
them, it might catalyze constructive dialog. It also suggests
elements of a natural history, informed by interdisciplinary
connections emergent only in the last century.

A larger "timeline of concept-relevance" for ideas 
might thus, for example, begin with the elemental con- 
cepts of: 

• dimension (1D, 2D, 3D, 3+1D, n+1D, etc.),

• metric (e.g. Pythagoras' space & Minkowski's 
space-time) ; 






followed by the emergence in our world of manifesta- 
tions now represented by basic physical concepts like: 

• motion, momentum-energy, mass & gravity 

• other interactions like charge & electromagnetic 
force, 

• particles, waves, atoms & their associated chem- 
istry, 

• heat, an early application of gambling with uncer- 
tainty, 

• available work and mutual information in physical 
systems. 

Discoveries on earth then lead to the following inventions
by one-celled organisms:

• steady-state thermal-to-chemical (i.e. photosynthetic)
and chemical-to-kinetic (i.e. motion) energy
conversion,

• energy storage in combustible sugars, and mechanisms
for withdrawal (via enzymes) into more easily
and universally spendable ATP molecules,

• digital information partnering (trading work for 
recorded correlations) with highly replicable amino 
and nucleic acid codes, and thus in a sense the prac- 
tice (but not yet the idea) of genetic engineering; 

• intracellular structures like cell membranes & or- 
ganelles (symbiotic) and viruses (parasitic), 

• chemical and tactile messaging; 

• tools like thermal or chemical gradients for locating 
energy sources, & contact forces for motility; 

subsequent inventions by multi-celled plants of: 

• intercellular correlations like eukaryotic cells, sex- 
ual reproduction & microbe-assisted digestion 
(symbiotic), or bacterial infections (parasitic), 

• differentiated structures like circulatory systems, 
leaves, stalks, roots, flowers, & shells, 

• other intra-organism correlations such as annual/biennial
reproductive cycles (symbiotic), or
cancerous tissues (parasitic), and

• inter-organism correlations, like ritualized inter-species
redirection of behavior by providing animals
with fruit and nectar symbiotically, or fake sex parasitically,
so as to distribute seeds & pollen,

• messaging via hormone (intra-organism) & exterior 
design; 

• tools like gravity and wind as aids to reproduction; 



the invention by animals of: 

• information partnering with neural net patterns via 
sense-mediated action, of limited replicability per- 
haps greatest in ritualized songs, discovery dances, 
& warning vocalizations (especially for birds, bees, 
and mammals), 

• intra-organism structures like vertebrae, muscles, 
brains, eyes with lenses, gills, lungs, hearts, legs & 
wings, 

• intra-species aggression and its ritualized
redirections, including greeting ceremonies,
pair bonds & laughter,

• family systems serving inward-looking perspectives 
with respect to intra-specific gene-pool boundaries, 
and related correlations like joint-parenting & con- 
structive sibling interactions (e.g. play between kit- 
tens) , 

• political systems serving outward-looking perspectives
with respect to intra-specific gene-pool boundaries,
and associated correlations such as hierarchy
in a wolf pack,

• inter-organism messaging via sound, body language,
and interior design (bower),

• tools like webs, levers, vines, tunnel, dam, wood, & 
stone; 

and finally the invention in human communities of: 

• information partnering with highly replicable spo- 
ken languages, print, and most recently digital 
codes, 

• available work production & distribution in these 
forms: food/drink, ritual (monetary), fossil fuel & 
electrical, 

• subsystem repair/augmentation networks like 
medicine, dentistry, pharmacy, auto repair, & 
physical therapy, 

• redirective elicitors (symbiotic & parasitic) of in- 
nate behavior (like eating, procreation & militant 
enthusiasm) include sports, "mind" chemicals, non- 
reproductive sex, self-help, psychotherapy, plus ar- 
tificial colors, flavors, smells & shapes (for food & 
individuals), 

• prediction activities like meteorology, gambling, in- 
surance, digital modeling, investing, polling & qual- 
ity control, 

• evolving pair, family, and hierarchy paradigms with
roots in phylogenetic & memetic tradition, like
votes, jury, public corporations, church/state separation,
free press, merit/goal-based management,
human rights,

• belief systems serving inward-looking perspectives
with respect to meme-pool boundaries, and related 
correlations that include religions (symbiotic) & re- 
ligious colonialism, plus ethnic, cultural, & artistic 
identities, 

• knowledge systems serving outward-looking per- 
spectives with respect to meme-pool boundaries, 
and related correlations that include professions, 
"political correctness", secular colonialism & peer 
review, 

• inter-organism messaging via music, art, writing,
printing, teletype, radio, phonograph, telephone, 
photography, cinema, fingerprint, television, xerog- 
raphy, magnetic tape, bar code, optical disk, pager, 
sky phone & internet, 

• tools like clothes, fire, oven, wheel, ramp, 
plow, weapons, skyscraper, bike, road, 
steam/gasoline/electric motors, car, bridge, 
train, boat, aircraft, match, washing machine, 
vacuum cleaner, flush toilet, dishwasher, grinders, 
mirror, glass lens, camera, camcorder, hologram, 
clock, artificial light, metals, ceramics, concrete, 
canning, polymers, gear, cam, lock, spring, rope, 
pulley, block & tackle, zipper, scissors, nail, screw, 
irrigation, planting & harvesting equipment, 
wrench, portable drill/saw, end loader, running 
water, gas & electric heater & drier (for people, 
food, & clothes), elevator, crane, battery, volt-ohm 
meter, laser, compass, satellite, global positioning 
system, gyroscope, autopilot, smoke detector, 
fridge & air conditioner, X-ray & ultrasound & 
MRI imaging, tele & micro & endo scopes, spectrometers,
semiconductors, transistors, integrated
circuits, computers, fiber optics, robotics, and the 
idea of polymerase chain reaction for nucleic acid 
sequence replication. 

Thus correlations, written in nucleic or amino acid 
strings, have been developing in symbiosis with microbes 
since the very early days of our planet. Moreover, some- 
time since the Cambrian bloom of metazoan body types, 
and particularly among humans in the past 10,000 years, 
similar correlations written in memetic codes have been 
undergoing active development. The latter were of course 
broadcast not via the sharing of molecules, but by trans- 
ference between neural nets through metazoan senses via 
performance, speech, script, and more recently digital 
means starting with the Phoenician alphabet. 

The large number of thermodynamic and information- 
theoretic processes in this list raises a question about 
codes that arises often today in context of the human 
genome project: What gene is responsible for what fea- 
tures of an organism, or conversely what features of an 
organism does a given nucleic acid sequence "cause"? 
The same question of course can be asked about memetic 
codes. Has a given set of ideas been honed via experience
with the world around us, via experience with worlds
within this boundary or that, or does it offer little by
way of connection to the world at all? I hope that we've
shown here that in any rigorous sense such questions
must be considered questions not about the properties
of a molecule or a set of words in isolation, but rather
questions about delocalized correlations between physical
objects (in particular between codes or their phenotypes,
and other parts of the world around). Once the context
is specified (e.g. the reference state used in equation 36),
objective and even quantitative assessments of these correlations
may be possible.

Qualitatively, for example, most might agree that the 
nucleic acid base triplets UAA, UAG, and UGA have 
evolved as elements of punctuation in the genetic code, 
there not to correlate with the outside world but to guide 
the process of translation into protein, much as the
period at the end of this sentence guides the sentence's 
transcription into speech. Such punctuation codes are 
one kind of internal code, developed to guide the repli- 
cation of codes and their reduction to practice. Other 
codes have evolved by virtue of (i.e. their survival has 
been connected in a real-time manner to) the correlations 
that they affect between an organism and the inanimate 
world around. Thus a chunk of genetic code might corre- 
late with the thickness to length ratio of a plant's stem, 
whose optimum value may depend on wind velocities and 
topography in the world around. Similarly, a set of ideas 
for guiding the path of a ship at sea might survive de- 
pending on its usefulness in helping the sailors reach their 
destination, before they run out of supplies. 

Some kinds of internal code affect (and are affected by) 
the way manufacturing is carried out within cells. Oth- 
ers affect the ways cells interact with one another, and 
yet others affect the way tissues function as a unit, etc., 
across the levels illustrated in the figure. Codes (genetic
or memetic) whose survival is predicated primarily on
correlations between or internal to lifeforms (rather than
specifically between a lifeform and its inanimate envi-
ronment) might be called "we-codes". Thus for example,
many might agree that legal systems provide guidelines 
(in this case we-memes) for cooperation between more 
than one genetic subgroup or nuclear family. Clarifying 
our ideas about the ways that segments of code partic- 
ipate in correlations between the organisms they guide, 
and other parts of the world, is even more important now 
that genetic codes are being transcribed by humans into
memetic form. 

Examining any given correlation from this list quantitatively
(cf. Appendix B) may or may not be meaningful.
However, the list does make it easy to see why
thermodynamic metaphors (e.g. as recently pointed out 
by a social-science student in a physics class here) seem 
relevant to processes found in even the most complex 
social systems, including economic systems that involve 
money (a ritually-conserved quantity designed for porta- 
bility). Thus management of energy flows through vari- 
ous forms of available work, ways to thermalize that en- 
ergy reversibly so as to create and preserve correlations 
between and within organisms and their environment,
and the storage of information in increasingly more repli- 
cable forms, are central and recurring parts of life's ad- 
venture. 



VIII. NET SURPRISAL & THE UNEXPECTED 

Entropy, a measure of expected or average surprisal, 
has been important in thermodynamics since the work 
of Clausius in 1865, although its firm connection to in- 
formation measures is more recent. Net surprisals, de- 
fined as a difference in average surprisals between one 
state of information and another (the second being of- 
ten some reference or equilibrium state), were initially 
referred to by Gibbs in 1875 as availabilities. Gener-
alized availabilities, negative logarithms of the partition
function as shown above, are deeply rooted in the math-
ematics of statistical inference, and increasingly recog-
nized for their connection to both free energy [22] and
complexity [51]. Lloyd's measure of complexity via depth is
in the same category.
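
As a reminder of what "a difference in average surprisals" means in practice, the following is a minimal sketch in notation we assume here (probabilities $p_i$ for the state of information of interest, $p_{oi}$ for the reference state, and $k$ setting the information units); it is meant to match the net surprisal defined earlier in these notes, not to replace it.

```latex
I_{net} \;=\; k \sum_i p_i \ln\frac{p_i}{p_{oi}}
        \;=\; \underbrace{k \sum_i p_i \ln\frac{1}{p_{oi}}}_{\text{average surprisal relative to the reference}}
        \;-\; \underbrace{k \sum_i p_i \ln\frac{1}{p_i}}_{S}
        \;\geq\; 0 .
```

Both averages are taken over the $p_i$. When the reference assigns equal probability to each of $\Omega$ states, the first term is $k \ln \Omega$ and the net surprisal reduces to $k \ln \Omega - S$, the form used for the multiple-choice test in Appendix A.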

Although we stop with the discussion of net surprisals 
here, it is difficult to resist pointing out that lifeforms, as 
information engines in symbiosis with codes which sur- 
vive by replication, have a vested interest in being able 
to distinguish alternatives with high net surprisal rela- 
tive to an expected ambient (i.e. with respect to what 
is common). The noble passions (à la Fig. 3), e.g. for be-
ing a good friend/mentor, sibling/parent, citizen/leader,
believer/cleric, and witness/scholar, may be considered 
evidence of interest in recognizing net surprisal. 

Corollary symptoms of preoccupation with recognizing 
high net surprisal include: (a) the positive importance in 
human culture of attributes like special, unique, or rare; 
(b) the human appetite for variety, pleasant surprises, 
and even gambling; (c) the importance of special recog- 
nition throughout life, including the need for attention 
as a youngster and the need to signify, have great "dis- 
cretionary power", or be famous as an adult; (d) the im- 
portance of humor, and the discovery of novel, fortuitous, 
and/or surprising connections in our language and behav- 
ior as a kind of "dessert" for us information processors at 
the end of a long day's work; (e) the use in the vernacular
of adjectives like "heated", "hot", and "steamed" for
situations in which action dominates thought (e.g. high 
eV/bit), and adjectives like "cool" for situations in which 
information dominates by comparison; and (f) of course 
the desire to see the genetic and memetic codes, that 
we've had a hand in designing, fare well with challenges 
posed by their environment in days ahead. 



IX. CONCLUSIONS 

Statistical physics is perhaps the most quantitative 
tool available for the generalist, in that it allows one with 
meager information to make rigorous assertions that a
subset of outcomes is going to be impossible in practice.
Perpetual motion machines are the classic example.
Although the calculation methods, and some of the
examples used above, are old, the problems they can ad- 
dress are contemporary. 

As tools in the "science of the possible" these methods
can be used to show (for example) that reversible meth-
ods for converting high temperature heat to low temper-
ature heat for home heating could reduce the energy cost
of heating by an order of magnitude [18, 53], and that go-
ing to the store to buy a package of automobile seeds
is not an inconceivable alternative for our descendants a
century from now. Awareness of mutual information is
crucial to our understanding of both quantum computers
of the future, and the molecular machines for replicat-
ing nucleic acid sequences which keep us going today.
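
The order-of-magnitude claim for home heating can be checked with a few lines of arithmetic. The sketch below assumes one particular reversible scheme (a Carnot engine driven by the flame, whose work runs a Carnot heat pump) and illustrative temperatures of our own choosing: 1500 K for the flame, 295 K for the room, and 273 K outdoors. The function name is hypothetical.

```python
def reversible_heating_gain(t_source, t_room, t_outside):
    """Heat delivered to the room per unit of high-temperature heat consumed.

    Assumed scheme: a reversible engine runs between t_source and t_outside,
    and its work output drives a reversible heat pump that lifts heat from
    t_outside up to t_room. All temperatures are absolute (kelvin).
    """
    work_per_heat_in = 1.0 - t_outside / t_source          # Carnot efficiency
    heat_out_per_work = 1.0 / (1.0 - t_outside / t_room)   # ideal heat-pump coefficient
    return work_per_heat_in * heat_out_per_work


# Illustrative (assumed) temperatures: flame, room, and freezing outdoors.
gain = reversible_heating_gain(t_source=1500.0, t_room=295.0, t_outside=273.0)
print(f"room heat per unit of flame heat: {gain:.1f}")
```

Letting the flame heat flow directly into the room corresponds to a gain of 1, so under these assumed temperatures the reversible route delivers roughly ten times more heat per unit of fuel.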

Concerning the information theory paradigm itself,
Amnon Katz said in the preface to his 1967 book [9] that
writing a book on the information theory approach to
statistical physics was worthwhile to him primarily be-
cause it provides a coherent overview for the novice. He
said that his book found little favor with the experts in
statistical mechanics, because they already knew how to
pose questions and get answers. Part of the lack of interest
in a new way to look at things on the part of experts was
likely paradigm paralysis of the same sort that prompted
Swiss watchmakers in the late 1960s to discredit as un-
interesting, and to eventually give away, their own inven-
tion of the quartz-movement watch along with most of
their market share in the watchmaking industry.

Now, a half-century later, the pervasive influence of
the paradigm shift on mid-level physics texts, and the
experimental impact of mutual information in molecu-
lar biology and nano-computing research, have left skep-
tics (even if they are legitimately tired of hand-waving
metaphors) with little to hang onto except the large
size of Avogadro's number, which makes uncertainty in-
creases associated with heat flow tens of orders of mag-
nitude larger than those associated with the traditional
objects of gambling theory. The good news here is
that the paradigm shift offers additional food for thought
to students not majoring in physics (especially those in-
volved in the code-based sciences). Of course it is in
part the responsibility of physicists to provide such stu-
dents, in an introductory course, with physical insight
into quantitative ways for putting these tools to use.



APPENDIX A: MULTIPLE CHOICE MAXENT 

Suppose we wish to determine the "state" of a pop- 
ulation of individuals with respect to the way they will 
respond to the questions on an N-question, m-choice
multiple-choice test. The only information that we have,
however, is the average number of correct answers
$\kappa = \langle j \rangle$ by members of that population.

FIG. 7: Guessed fraction of students versus the number of
correct answers out of 20 questions, when the average grade
for the class is 10, 12, 14, 16 and 18. The dashed line is the net
surprisal (in bits per question) of the distribution with given
average, relative to that expected for 20 randomly answered
true-false questions.

Let's examine the situation more closely. There are m
ways to answer each question, so there are $m^N$ ways to
respond to the test as a whole. For each possible number
of correct answers $j \in \{0, 1, \ldots, N\}$ there are

$$\text{number of states} = \binom{N}{j} (m-1)^{N-j} \equiv d_j. \qquad \text{(A1)}$$


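The function $d_j$ is called the density of states. Using
it, we can write the results of the entropy maximization
from equations 17, 18 and 19, respectively, as

$$p_j = \frac{1}{Z} e^{-\lambda j}, \qquad \text{(A2)}$$

$$Z = \sum_{j=0}^{N} d_j e^{-\lambda j} = \left(m - 1 + e^{-\lambda}\right)^N,$$

$$\frac{S}{k} = \ln Z + \lambda \langle j \rangle.$$

The value of $\lambda$ is found by solving the implicit equation

$$\langle j \rangle = \kappa = -\frac{\partial \ln Z}{\partial \lambda} = \frac{N e^{-\lambda}}{m - 1 + e^{-\lambda}}.$$
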
When the $d_j p_j$ are plotted as a function of j for a given
value of the average $\kappa$ (cf. Fig. 7), one gets the binomial
distribution. This in turn, for large N and values of $\kappa$ not
too large or small, may be approximated by the continuous
Gaussian distribution or bell curve. When N is large but $\kappa$
is very small by comparison, it reduces to the Poisson
distribution, useful for predicting random but unlikely
events like the distribution of meteor impacts. The net
surprisal per question (dashed line in Fig. 7) quantifies the
amount of mutual information about course material (rel-
ative to random answers on a true-false test) evidenced
by students with a given mean score. This net surprisal
measure ($I_{net} = kN \ln m - S$) can help teachers take into
account the fact that a score of N/m correct, on an m-
choice N-question test, provides zero evidence of student
learning since random guesses would on average yield the
same result.
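
The guessed score distribution and its net surprisal are easy to tabulate numerically. The following is a minimal sketch, assuming the density of states of equation A1 and a maxent weight proportional to $e^{-\lambda j}$ for each answer sheet with j correct answers; the function names are ours, and the closed form used for $e^{-\lambda}$ follows from requiring that the mean score equal the given average.

```python
import math


def density_of_states(N, m, j):
    """Number of answer sheets with exactly j correct answers (equation A1)."""
    return math.comb(N, j) * (m - 1) ** (N - j)


def score_distribution(N, m, kappa):
    """Maxent probability of each score j = 0..N, given only the mean score kappa."""
    # Requiring <j> = kappa gives exp(-lam) = (m - 1) * kappa / (N - kappa),
    # which is valid for 0 < kappa < N.
    x = (m - 1) * kappa / (N - kappa)      # x = exp(-lam)
    Z = (m - 1 + x) ** N                   # partition function
    return [density_of_states(N, m, j) * x ** j / Z for j in range(N + 1)]


def net_surprisal_bits_per_question(N, m, kappa):
    """(N ln m - S/k) per question, converted from nats to bits."""
    x = (m - 1) * kappa / (N - kappa)
    lam = -math.log(x)
    ln_Z = N * math.log(m - 1 + x)
    entropy_nats = ln_Z + lam * kappa      # S/k = ln Z + lam * <j>
    return (N * math.log(m) - entropy_nats) / (N * math.log(2))


# The class averages used in Fig. 7 (20 true-false questions).
for kappa in (10, 12, 14, 16, 18):
    bits = net_surprisal_bits_per_question(20, 2, kappa)
    print(f"average {kappa}: {bits:.3f} bits per question")
```

An average of N/m (here 10 out of 20) gives zero net surprisal, in line with the remark that such a score provides no evidence of learning.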



The entropy of the state distribution S also quanti- 
fies our uncertainty about the response of any particular 
member of the population to the test, given only the pop- 
ulation average grade. There is an alternative way to look 
at it as well. Suppose there was some physical process 
operating which acted only to hold the average grade at 
the specified value, but otherwise had no effect on the 
probability distribution. If that were the only physical 
process determining the outcome of the test, then we 
might expect the actual spread in test responses to agree 
with the distribution of responses predicted by equation
A2, which guesses based only on the measured value for the
average $\kappa$. Thus even given detailed information on the
set of responses to the test, it would help little in decreasing our
uncertainty when trying to predict responses because the
constraint on $\kappa$ was the only physical process constraining
the distribution.

This is more than idle speculation. Indeed the main 
control variable in educational testing is often the "dif- 
ficulty" of the questions, and actual test responses for 
a given average grade do tend to follow the distribution 
predicted in equation A2. Failure of this to occur is thus
evidence for other constraints operating (e.g. the pres-
ence of two populations of students). By the same token,
agreement between the predicted distribution and the ac-
tual one cannot be taken as evidence that no
other physical principles are active in the system, only
that the results of the test probably tell us little about
them.

In the foregoing example, the response of a popula- 
tion sample to a test was classified into a set of "mi-
crostates". The uncertainty about the response given
only information about the average grade was shown to 
agree with the spread in measured values, from which it 
was inferred that the major physical constraint acting on 
the response distribution was in effect one which deter- 
mined the average value. In a more general sense when 
we consider all of the physical microstates accessible to 
a given system, the physical entropy of that system is 
defined as the uncertainty calculated when all externally 
detectable constraints on the state distribution are used 
as constraints in the maxent calculation. It is thus
the minimum uncertainty possible, based on all of the 
"mutual information" about the system available to the 
world outside. 



APPENDIX B: MUTUAL INFORMATION BASICS



A discussion of correlated subsystems which, between 
themselves, house mutual information must begin with a 
definition of subsystems. Such subsystems are variously 
defined, for example, as individual particles, as collec- 
tions of particles, as individual states (which may or may 
not be occupied with particles), and as regions or con- 
trol volumes in and out of which energy and mass might 
flow. Our earlier distinction between a steady state engine
and its environment, as well as the molecule, cell,
tissue, metazoan, gene-pool and meme-pool boundaries
discussed in section VII, also fall in this category.

Even the isolated-system second law itself must first 
separate the world into observed-system and observer, 
because as we mentioned earlier it also is an assertion 
about the time evolution of mutual information, namely: 
The correlation information that an observed physical 
system has with an environment from which it is iso-
lated will not increase with time, and will likely decrease
since information available to the environment needed to 
model propagation of the isolated system through time 
may fall short. 

Once some hopefully useful boundaries for the sub-
systems of interest in a given problem have been de-
fined, a set of $N_{ss}$-subsystem joint probabilities can be
defined. For $N_{ss} = 2$, joint probabilities $p[ij] \geq 0$
obey $\sum_i \sum_j p[ij] = 1$. Here the i indices run over all
possible states for subsystem I, while the j indices run
over all possible states for subsystem J. From the joint
probabilities one can calculate marginal probabilities like
$p[i] = \sum_j p[ij]$, which ignore the state of other subsys-
tems, and conditional probabilities like $p_i[j] = p[ij]/p[i]$
associated with a specific state of one subsystem (here the ith
state of subsystem I). From these probabilities, values
for joint entropy $S[IJ]/k = \langle -\ln p[ij] \rangle$, marginal en-
tropies like $S[I]/k = \langle -\ln p[i] \rangle$ and conditional entropies
like $S_i[J]/k = \langle -\ln p_i[j] \rangle$ follow immediately. Mutual or
correlation information between systems I and J, denoted
here as M[I,J] and defined by equation 42 as the sum
of marginal entropies S[I] + S[J] minus the joint entropy
S[IJ], thus becomes

$$M[I,J] = k \sum_i \sum_j p[ij] \ln \frac{p[ij]}{p[i]\,p[j]} \geq 0. \qquad \text{(B1)}$$

Thus from equation 36 it appears that mutual informa-
tion is simply the net surprisal that follows upon learning
that systems (here I and J) are not independent.
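
For two subsystems, the mutual information of equation B1 can be computed directly from a table of joint probabilities. This is a minimal sketch with names of our own choosing; it works in bits (i.e. $k = 1/\ln 2$) and takes terms with $p[ij] = 0$ to contribute nothing.

```python
import math


def mutual_information_bits(joint):
    """M[I,J] in bits from a table joint[i][j] of joint probabilities p[ij]."""
    p_i = [sum(row) for row in joint]          # marginals p[i]
    p_j = [sum(col) for col in zip(*joint)]    # marginals p[j]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:                        # 0 ln 0 is taken as 0
                total += p * math.log2(p / (p_i[i] * p_j[j]))
    return total


# A pair of spins known to be opposite, but otherwise unknown.
anticorrelated = [[0.0, 0.5],
                  [0.5, 0.0]]

# Two independent fair coins.
independent = [[0.25, 0.25],
               [0.25, 0.25]]

print(mutual_information_bits(anticorrelated))
print(mutual_information_bits(independent))
```

The anticorrelated pair carries one bit of correlation information (learning one spin tells us the other), while the independent pair carries none.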

Examples of correlated subsystem pairs include pho- 
ton or electron pairs with opposite but unknown spins, 
a single strand of messenger RNA and the sequence 
of nucleotides in the gene from which it was copied, a 
manuscript and a copy of that manuscript created with 
a xerox machine (or a video camera), your understand- 
ing of a subject before being given a test and the answer 
key used by the teacher to grade that test (hopefully), 
enzymes and coenzymes with site specificity, tissue sets
treated as friendly by your immune system, metazoans
who developed from the same genetic blueprints (e.g. 
identical twins), families that share similar values, and 
cities which occupy similar niches in different cultures 
(e.g. sister cities). As you might imagine this list of sub- 
system pairs is incomplete, and many of the quantities 
listed remain difficult to quantify. 

With increasing $N_{ss}$, the number of marginal and con-
ditional entropies increases rapidly, and many new mu-
tual information terms (all non-negative) emerge as well. For
example, when $N_{ss} = 3$, marginal probabilities exist
which ignore either one or two subsystems (e.g.
$p[ij] = \sum_k p[ijk]$ and $p[i] = \sum_j \sum_k p[ijk]$). Similarly,
conditional probabilities can specify the state of one
or two subsystems. These all give rise to analogous
marginal and conditional entropies. Lastly, a set of seven
mutual information terms can be calculated: the joint
correlation M[I,J,K] = S[I] + S[J] + S[K] - S[IJK],
three one-on-two terms like M[I,JK] = S[I] + S[JK] -
S[IJK], and three one-on-one terms like M[I,J] =
S[I] + S[J] - S[IJ].

These mutual information terms all have non-negative val-
ues which are independent of argument ordering, e.g.
M[I,J] = M[J,I]. One useful identity is M[I,J,K] =
M[I,JK] + M[J,K], which in words says that "the joint
mutual information of systems I, J and K is the cor-
relation information between system I and system JK,
plus that between systems J and K". Another relation-
ship that we conjecture here is $M[I,JK] \geq M[I,J]$, or
in other words: "System I and system JK have at least as
much in common as do systems I and J alone".
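
Both relationships can be spot-checked numerically on an arbitrary three-subsystem joint distribution. The sketch below uses helper names of our own and a randomly generated distribution; a numerical check of this kind illustrates the relations rather than proving them. For ordinary (classical) probabilities the inequality is the statement that the conditional mutual information between I and K, given J, cannot be negative.

```python
import itertools
import math
import random


def entropy(probs):
    """Entropy in nats (S/k) of a collection of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def marginal_entropy(joint, keep):
    """Entropy of the marginal distribution over the subsystem positions in keep."""
    marginal = {}
    for state, p in joint.items():
        key = tuple(state[axis] for axis in keep)
        marginal[key] = marginal.get(key, 0.0) + p
    return entropy(marginal.values())


def random_joint(n_states=3, seed=0):
    """An arbitrary normalized joint distribution p[ijk] over three subsystems."""
    rng = random.Random(seed)
    states = list(itertools.product(range(n_states), repeat=3))
    weights = [rng.random() for _ in states]
    total = sum(weights)
    return {s: w / total for s, w in zip(states, weights)}


joint = random_joint()
I, J, K = 0, 1, 2


def S(*keep):
    return marginal_entropy(joint, keep)


M_IJK = S(I) + S(J) + S(K) - S(I, J, K)    # joint correlation
M_I_JK = S(I) + S(J, K) - S(I, J, K)       # one-on-two term
M_JK = S(J) + S(K) - S(J, K)               # one-on-one terms
M_IJ = S(I) + S(J) - S(I, J)

print("identity residual:", M_IJK - (M_I_JK + M_JK))
print("M[I,JK] - M[I,J]:", M_I_JK - M_IJ)
```

Changing the seed or the number of states per subsystem gives other test distributions.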

The maxent formalism, of course, automatically esti- 
mates joint probabilities, from which all of these quan- 
tities follow. Figuring out how to constrain the maxi- 
mization with knowledge of mutual information between 
subsystems is therefore the primary challenge in adapting 
it to such problems. 